Re: [PATCH v2 2/3] sched/core: Forced idle accounting per-cpu

From: cruzzhao
Date: Fri Jan 14 2022 - 10:04:19 EST




On 2022/1/12 9:59 AM, Josh Don wrote:
> On Tue, Jan 11, 2022 at 1:56 AM Cruz Zhao <CruzZhao@xxxxxxxxxxxxxxxxx> wrote:
>>
>> Accounting for "forced idle" time per cpu, which is the time a cpu is
>> forced to idle by its SMT sibling.
>>
>> As it's not accurate to measure the capacity loss only by cookie'd forced
>> idle time, and it's hard to trace the forced idle time caused by all the
>> uncookie'd tasks, we account the forced idle time from the perspective of
>> the cpu.
>>
>> Forced idle time per cpu is displayed via /proc/schedstat, we can get the
>> forced idle time of cpu x from the 10th column of the row for cpu x. The
>> unit is ns. It also requires that schedstats is enabled.
>>
>> Signed-off-by: Cruz Zhao <CruzZhao@xxxxxxxxxxxxxxxxx>
>> ---
>
> Two quick followup questions:
>
> 1) From your v1, I still wasn't quite sure if the per-cpu time was
> useful or not for you versus a single overall sum (ie. I think other
> metrics could be more useful for analyzing steal_cookie if that's what
> you're specifically interested in). Do you definitely want the per-cpu
> totals?
>
IMO, the per-cpu forced idle time tells us whether forced idle time is
spread evenly across the cpus of a core, or across all cpus in the
system. It is a kind of balance indicator.

> 2) If your cgroup accounting patch is merged, do you still want this
> patch? You can grab the global values from the root cgroup (assuming
> you have cgroups enabled). The only potential gap is if you need
> per-cpu totals, though I'm working to add percpu stats to cgroup-v2:
> https://lkml.kernel.org/r/%3C20220107234138.1765668-1-joshdon@xxxxxxxxxx%3E

If the cgroup accounting patch is merged, this patch is still needed.

Consider the following scenario:
Task p of cgroup A is running on cpu x, and it forces cpu y into idle
for t ns. The forced idle time of cgroup A on cpu x increases by t ns,
and the forced idle time of cpu y increases by t ns.

That is, the forced idle time of a cgroup is defined as the forced idle
time it caused, while the forced idle time of a cpu is defined as the
time that cpu itself is forced into idle. The two have different meanings.
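
To make the two definitions concrete, here is a toy model of the scenario
above. It is only a sketch: the counter names and the account() helper are
made up for illustration and are not kernel data structures, and the
charging rule is the simplified one described in this mail (x = cpu 0,
y = cpu 1, t = 500 ns).

```python
from collections import defaultdict

# Hypothetical counters for illustration only, not kernel structures.
# Cgroup view: time the cgroup *caused*, keyed by (cgroup, cpu the task ran on).
cgroup_forceidle_ns = defaultdict(int)
# Cpu view: time the cpu itself *spent* forced idle, keyed by the idled cpu.
cpu_forced_idle_ns = defaultdict(int)


def account(cgroup, running_cpu, idle_cpu, t_ns):
    """Record one forced-idle interval under both definitions."""
    cgroup_forceidle_ns[(cgroup, running_cpu)] += t_ns
    cpu_forced_idle_ns[idle_cpu] += t_ns


# Task p of cgroup A runs on cpu 0 (x) and forces cpu 1 (y) idle for 500 ns.
account("A", running_cpu=0, idle_cpu=1, t_ns=500)
```

The same interval lands on cpu 0 in the cgroup view and on cpu 1 in the
cpu view, so the two counters answer different questions.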

And for SMT > 2, we cannot calculate the forced idle time of cpu x from
the cgroup interface.
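
A small sketch of why, under the same simplified charging rule as in this
mail (the totals() helper and the event tuples are hypothetical, chosen
only for illustration): on an SMT4 core, two different histories give the
same per-cgroup numbers but different per-cpu forced idle times.

```python
def totals(events):
    """Sum forced-idle intervals for one cgroup.

    events: iterable of (running_cpu, idle_cpu, t_ns).
    Returns (cgroup view keyed by running cpu, cpu view keyed by idled cpu).
    """
    per_cgroup_cpu = {}
    per_idle_cpu = {}
    for running_cpu, idle_cpu, t_ns in events:
        per_cgroup_cpu[running_cpu] = per_cgroup_cpu.get(running_cpu, 0) + t_ns
        per_idle_cpu[idle_cpu] = per_idle_cpu.get(idle_cpu, 0) + t_ns
    return per_cgroup_cpu, per_idle_cpu


# SMT4 core with cpus 0..3; the cgroup's task runs on cpu 0 in both cases.
case_a = [(0, 1, 300), (0, 1, 200)]  # only cpu 1 is forced idle
case_b = [(0, 2, 300), (0, 3, 200)]  # cpu 2 and cpu 3 are forced idle

cg_a, cpu_a = totals(case_a)
cg_b, cpu_b = totals(case_b)
```

Here cg_a and cg_b are identical, while cpu_a and cpu_b differ, so the
cgroup numbers alone do not determine which sibling was idled. With SMT2
there is only one sibling, so the ambiguity does not arise in this model.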

Best,
Cruz Zhao