Re: [PATCH 2/2] sched: Implement interface for cgroup unified hierarchy

From: Peter Zijlstra
Date: Tue Aug 01 2017 - 17:40:57 EST


On Tue, Aug 01, 2017 at 01:17:45PM -0700, Tejun Heo wrote:

> > What about the whole double accounting thing? Because currently cpuacct
> > and cpu do a fair bit of duplication. It would be very good to get rid
> > of that.
>
> I'm not that sure at this point. Here are my current thoughts on
> cpuacct.
>
> * It is useful to have basic cpu statistics on cgroup without having
> to enable the cpu controller, especially because enabling cpu
> controller always changes how cpu cycles are distributed and
> currently comes at some performance overhead.
>
> * On cgroup2, there is only one hierarchy. It'd be great to have
> basic resource accounting enabled by default on all cgroups. Note
> that we couldn't do that on v1 because there could be any number of
> hierarchies and the cost would increase with the number of
> hierarchies.

Yes, the whole single hierarchy thing makes doing away with the double
accounting possible.

> * It is bothersome that we're walking up the tree each time for
> cpuacct although being percpu && just walking up the tree makes it
> relatively cheap.

So even if it's only CPU-local accounting, you still have all the pointer
chasing and cache misses, not to mention that a faster O(depth) is still
O(depth).

> Anyways, I'm thinking about shifting the
> aggregation to the reader side so that the hot path always only
> updates local counters in a way which can scale even when there are
> a lot of (idle) cgroups. Will follow up on this later.

Not entirely sure I follow; we currently only update the current cgroup
and its ancestors, no? Or are you looking to only account into the
current cgroup and propagate into the parents on reading?
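
To make the two schemes being contrasted concrete, here is a rough
userspace sketch. It is not kernel code: struct grp, NR_CPUS and all the
helper names are made up for illustration, and locking, cgroup removal
and real per-CPU storage are ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical fixed CPU count for this sketch */

/* Hypothetical stand-in for a cgroup; not the kernel's struct cgroup. */
struct grp {
	struct grp *parent;
	struct grp *first_child;
	struct grp *next_sibling;
	uint64_t usage[NR_CPUS];	/* one counter per CPU */
};

static void grp_init(struct grp *g, struct grp *parent)
{
	int cpu;

	g->parent = parent;
	g->first_child = NULL;
	g->next_sibling = NULL;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		g->usage[cpu] = 0;

	/* Link into the parent's child list so a reader can find us. */
	if (parent) {
		g->next_sibling = parent->first_child;
		parent->first_child = g;
	}
}

/* Writer-side propagation: charge the group and every ancestor, O(depth). */
static void charge_walk(struct grp *g, int cpu, uint64_t delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	for (; g; g = g->parent)
		g->usage[cpu] += delta;
}

/* Reader-side aggregation: the hot path only touches the local counter. */
static void charge_local(struct grp *g, int cpu, uint64_t delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	g->usage[cpu] += delta;
}

/* Sum one group's own per-CPU counters. */
static uint64_t read_local(const struct grp *g)
{
	uint64_t sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += g->usage[cpu];
	return sum;
}

/* With charge_local(), the reader folds in the whole subtree instead. */
static uint64_t read_subtree(const struct grp *g)
{
	uint64_t sum = read_local(g);
	const struct grp *c;

	for (c = g->first_child; c; c = c->next_sibling)
		sum += read_subtree(c);
	return sum;
}
```

With charge_walk() a read is just read_local() on the group of interest,
but every update pays for the depth of the tree. With charge_local() the
update is O(1) and the cost moves to read_subtree(), which scales with
the size of the subtree being read.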