Re: [PATCH v3 09/14] sched/core: uclamp: propagate parent clamps

From: Dietmar Eggemann
Date: Fri Aug 17 2018 - 11:50:38 EST

On 08/17/2018 04:45 PM, Patrick Bellasi wrote:
On 17-Aug 15:43, Dietmar Eggemann wrote:
On 08/06/2018 06:39 PM, Patrick Bellasi wrote:
In order to properly support hierarchical resources control, the cgroup
delegation model requires that attribute writes from a child group never
fail but still are (potentially) constrained based on parent's assigned
resources. This requires properly propagating and aggregating parent
attributes down to its descendants.

I don't understand the reason mentioned here:

IMHO, a write to a child's (tg1/tg11) cpu.rt_runtime_us can fail if the
value is restricted by the parent's value:

Well... that's my interpretation after this discussion:

So cgroups v2 uses .effective files for config propagation. Didn't know that.

AFAIU, what must not fail is a write to a parent that wants to enforce
more restrictive constraints on its child groups. Thus, if we have for example:

tg1: util_max=100%
tg1/tg11: util_max=80%

It should be possible without errors to set:

tg1: util_max=50%

and then enforce a 50% util_max to tg1/tg11 tasks too and eventually
use "effective" attributes to expose the effective value used at each
level of the hierarchy.
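
The tg1/tg11 example above can be sketched as a small model. This is only an illustration, not the kernel implementation: the class name `Group` and the method `effective_util_max` are hypothetical, and the sketch assumes the effective value is the group's own requested value capped by the parent's effective value.

```python
# Hypothetical model of the example above; names are illustrative only.
class Group:
    def __init__(self, name, util_max=100, parent=None):
        self.name = name
        self.util_max = util_max  # requested value, as written by the user
        self.parent = parent

    def effective_util_max(self):
        # Assumption: effective value = own request capped by the
        # parent's effective value, applied recursively up the hierarchy.
        if self.parent is None:
            return self.util_max
        return min(self.util_max, self.parent.effective_util_max())


tg1 = Group("tg1", util_max=100)
tg11 = Group("tg1/tg11", util_max=80, parent=tg1)
before = tg11.effective_util_max()

# Writing a stricter value to the parent never fails in this model;
# the child's requested value is left untouched.
tg1.util_max = 50
after = tg11.effective_util_max()

print(before, after, tg11.util_max)
```

In this sketch the write to tg1 always succeeds, and the restriction only shows up in the child's effective value.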

Ok, your example makes sense. But the text above says 'that attribute writes from a child group never fail but still are ...'. So this is a little bit different.

I guess with the knowledge that this is by default cgroups v2 and that config propagation is implemented via the .effective files it's digestible.

root@juno:/sys/fs/cgroup/cpu# cat cpu.rt_*
root@juno:/sys/fs/cgroup/cpu# cat tg1/cpu.rt_*
root@juno:/sys/fs/cgroup/cpu# cat tg1/tg11/cpu.rt_*
root@juno:/sys/fs/cgroup/cpu# echo 950000 > tg1/tg11/cpu.rt_runtime_us
-bash: echo: write error: Invalid argument
root@juno:/sys/fs/cgroup/cpu# echo 950000 > tg1/cpu.rt_runtime_us
root@juno:/sys/fs/cgroup/cpu# echo 950000 > tg1/tg11/cpu.rt_runtime_us
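
For contrast, the write sequence above can be modelled with a deliberately simplified check. This is a sketch under assumptions, not the kernel's RT bandwidth logic: the class `RtGroup`, the method `write_runtime`, and the starting values (950000 for the root, 0 for tg1 and tg1/tg11) are assumed for illustration, and the real schedulability test is more involved than a plain comparison with the parent.

```python
import errno


# Hypothetical, simplified model of a "reject on write" policy.
class RtGroup:
    def __init__(self, runtime_us=0, parent=None):
        self.runtime_us = runtime_us
        self.parent = parent

    def write_runtime(self, value):
        # Simplified assumption: refuse a value above the parent's limit.
        if self.parent is not None and value > self.parent.runtime_us:
            raise OSError(errno.EINVAL, "Invalid argument")
        self.runtime_us = value


root = RtGroup(runtime_us=950000)   # assumed starting value
tg1 = RtGroup(parent=root)          # assumed to start at 0
tg11 = RtGroup(parent=tg1)          # assumed to start at 0

first_write_failed = False
try:
    tg11.write_runtime(950000)      # parent tg1 is still at 0
except OSError as err:
    first_write_failed = err.errno == errno.EINVAL

tg1.write_runtime(950000)           # raise the parent first
tg11.write_runtime(950000)          # now the child write is accepted
print(first_write_failed, tg11.runtime_us)
```

The point of the model is only the ordering dependency: the child write is refused until the parent has been raised.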

This example is using the legacy hierarchy (cgroups v1).

Yeah, so your patches take unified (v2) as default.

AFAIK the default hierarchy (cgroups v2) has a much stricter set of
requirements for the "delegation model".

Could be ... I guess I have to study this more.


I assume here that the cpu.util.{min,max} of the child will not be used any
more because the 'effective' counterparts are taken instead.

Yes, the "effective" attributes are the ones used in kernel space for
the actual clamping.

However, the cpu.util.{min,max} of a child are still required as soon
as the parent relaxes its constraints... that is when we use their value to
set the "effective" value.

Yes, with the new background this makes sense.

I wonder whether this propagation could not have been provided with only cpu.util.{min,max}?

In the example before, if we use the same variables we miss the
opportunity to reset:

tg1/tg11: util_max=80%

as soon as tg1's util_max goes back to 100%.
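
The lost-reset problem described above can be illustrated by comparing the two designs side by side. This is a minimal sketch with hypothetical helper names (`clamp_in_place`, `effective`), assuming a two-level hierarchy and a simple min-based cap. It is not taken from the patch.

```python
# Design A (hypothetical): a single variable per group, overwritten
# whenever the parent becomes more restrictive.
def clamp_in_place(values, parent, child, new_parent_max):
    values[parent] = new_parent_max
    values[child] = min(values[child], new_parent_max)


single = {"tg1": 100, "tg1/tg11": 80}
clamp_in_place(single, "tg1", "tg1/tg11", 50)
clamp_in_place(single, "tg1", "tg1/tg11", 100)  # parent relaxes again
# The child stays at 50: its original request of 80 was overwritten.


# Design B (hypothetical): keep the requested value and derive the
# effective value on demand.
def effective(requested, parent, child):
    return min(requested[child], requested[parent])


requested = {"tg1": 100, "tg1/tg11": 80}
requested["tg1"] = 50
restricted = effective(requested, "tg1", "tg1/tg11")
requested["tg1"] = 100                          # parent relaxes again
restored = effective(requested, "tg1", "tg1/tg11")

print(single["tg1/tg11"], restricted, restored)
```

With separate requested and effective values, the child's 80% comes back on its own once the parent is relaxed. With a single variable it has to be rewritten by hand.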

Yes, from the config propagation point of view this should be pretty close to the v2 cpuset controller from Waiman Long.

Maybe mentioning that these .effective files are the 'standard' way to implement proper config propagation in cgroups v2 would help understanding this patch.