Although we can rely on cpuacct to present the CPU usage of each
task group, it is hard to tell how intense the competition for CPU
resources between these groups is.
Monitoring the wait time of every process, or parsing sched_debug,
costs too much, and there is no good way to accurately represent the
contention with that information; what we need is the wait time on
the group dimension.
Thus we introduce the group's wait_sum to represent the contention
between task groups, which is simply the sum of the wait time of the
group's sched entities on each CPU.
The 'cpu.stat' file is extended to show the new statistic, like:
 nr_periods 0
 nr_throttled 0
 throttled_time 0
 wait_sum 2035098795584
Now we can monitor the change in wait_sum to tell how much a task
group is suffering in the competition for CPU resources.
For example:
 (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%
means the task group spent X percent of the period waiting
for the CPU.
Signed-off-by: Michael Wang <yun.wang@xxxxxxxxxxxxxxxxx>
---
Since v1:
 Use schedstat_val to avoid compile error
 Check and skip root_task_group
 kernel/sched/core.c | 8 ++++++++
 1 file changed, 8 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d8fac..80ab995 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6781,6 +6781,8 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)
 static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
 {
+	int i;
+	u64 ws = 0;
 	struct task_group *tg = css_tg(seq_css(sf));
 	struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
@@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
 	seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
 	seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
+	if (schedstat_enabled() && tg != &root_task_group) {
+		for_each_possible_cpu(i)
+			ws += schedstat_val(tg->se[i]->statistics.wait_sum);
+		seq_printf(sf, "wait_sum %llu\n", ws);
+	}
+
 	return 0;
 }
 #endif /* CONFIG_CFS_BANDWIDTH */