Re: [PATCH] sched/core: add forced idle accounting for cgroups

From: Josh Don
Date: Fri May 13 2022 - 15:27:35 EST


Thanks Tejun,

On Thu, May 12, 2022 at 7:58 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> On Thu, May 12, 2022 at 05:54:27PM -0700, Josh Don wrote:
> > 4feee7d1260 previously added per-task forced idle accounting. This patch
> > extends this to also include cgroups.
> >
> > rstat is used for cgroup accounting, except for the root, which uses
> > kcpustat in order to bypass the need for doing an rstat flush when
> > reading root stats.
> >
> > Only cgroup v2 is supported. Similar to the task accounting, the cgroup
> > accounting requires that schedstats is enabled.
>
> We've been collecting scheduler stats in cgroup core so that we always have
> them available whether cpu controller is enabled or not. There's nothing
> actually specific to cpu controller, right? Would it make sense to collect
> the cpu core stats the same way as the rest of scheduler stats?

Yea, that's right, this doesn't require the cpu controller to be
enabled. Are you suggesting to add a new field to cgroup_base_stat?

One other weird artifact of collecting forceidle time is that a cpu
may account it on behalf of its hyperthread sibling. Currently, the
core rstat code always accounts to the current cpu's percpu rstat
field. I can add an accounting function to support writes to a
different cpu's field, in order to make sure that the per-cpu totals
are correct (the forceidle accounting code holds rq->__lock, which
protects all HT siblings of a core). percpu totals aren't currently
exported in cgroup v2, but this is useful information that we'll
consume, so it would be nice to keep it accurate.

Best,
Josh