[PATCH v2 0/3] Accounting forced idle time per cpu and per cgroup

From: Cruz Zhao
Date: Tue Jan 11 2022 - 04:56:15 EST


There are two types of forced idle time: forced idle time from cookie'd
tasks and forced idle time from uncookie'd tasks. The forced idle time from
uncookie'd tasks is actually caused indirectly by the cookie'd tasks in the
runqueue, so it is more accurate to measure the capacity loss with the sum
of both.

This patch set accounts forced idle time for each cpu to measure how long
the cpu is forced idle, which is displayed via /proc/schedstat. It also
accounts force idle time for each cgroup to measure how long the cgroup
forced its SMT siblings into idle, which is displayed via
/sys/fs/cgroup/cpuacct/cpuacct.forceidle and
/sys/fs/cgroup/cpuacct/cpuacct.forceidle_percpu. Note that the two terms
have different meanings: forced idle time is how long a cpu was forced
idle, while force idle time is how long a cgroup forced its SMT siblings
into idle.

We can get the total system forced idle time by looking at the root cgroup,
and we can get how long a given cgroup forced its SMT siblings into idle by
looking at that cgroup. If the force idle time of a cgroup is high, that
can be rectified by making some changes (e.g. affinity, cpu budget) to the
cgroup.
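
As a rough illustration of how a monitoring tool might use such counters,
here is a minimal userspace sketch. It assumes the counters are cumulative
and reported in nanoseconds, which this cover letter does not state. The
function names are hypothetical and are not part of the patches. The first
helper sums the two kinds of forced idle time to estimate capacity loss,
the second turns two samples of a cumulative counter into a share of the
sampling interval.

```python
def total_forced_idle(cookied_ns, uncookied_ns):
    """Capacity loss estimate: forced idle time from cookie'd tasks
    plus forced idle time from uncookie'd tasks (assumed nanoseconds)."""
    return cookied_ns + uncookied_ns


def forceidle_share(prev_ns, curr_ns, interval_ns):
    """Fraction of a sampling interval covered by the growth of a
    cumulative counter between two samples (assumed nanoseconds)."""
    if interval_ns <= 0:
        raise ValueError("interval_ns must be positive")
    delta = curr_ns - prev_ns
    if delta < 0:
        # A cumulative counter should never decrease between samples.
        raise ValueError("counter went backwards")
    return delta / interval_ns


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    print(total_forced_idle(1_500_000, 500_000))
    print(forceidle_share(1_000_000, 3_000_000, 10_000_000))
```

A cgroup whose share stays high over many intervals would be a candidate
for the affinity or cpu budget changes mentioned above.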

Cruz Zhao (3):
sched/core: Accounting forceidle time for all tasks except idle task
sched/core: Forced idle accounting per-cpu
sched/core: Force idle accounting per cgroup

include/linux/cgroup.h | 7 +++++
kernel/sched/core.c | 10 ++++--
kernel/sched/core_sched.c | 10 ++++--
kernel/sched/cpuacct.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++
kernel/sched/sched.h | 4 +++
kernel/sched/stats.c | 17 ++++++++--
6 files changed, 119 insertions(+), 8 deletions(-)

--
1.8.3.1