Re: [PATCH v4 09/16] sched/core: uclamp: map TG's clamp values into CPU's clamp groups

From: Suren Baghdasaryan
Date: Wed Sep 12 2018 - 11:53:46 EST


On Wed, Sep 12, 2018 at 7:19 AM, Patrick Bellasi
<patrick.bellasi@xxxxxxx> wrote:
> On 09-Sep 11:52, Suren Baghdasaryan wrote:
>> On Tue, Aug 28, 2018 at 6:53 AM, Patrick Bellasi
>> <patrick.bellasi@xxxxxxx> wrote:
>
> [...]
>
>> > +/**
>> > + * release_uclamp_sched_group: release utilization clamp references of a TG
>>
>> free_uclamp_sched_group
>
> +1
>
>> > + * @tg: the task group being removed
>> > + *
>> > + * An empty task group can be removed only when it has no more tasks or child
>> > + * groups. This means that we can also safely release all the reference
>> > + * counting to clamp groups.
>> > + */
>> > +static inline void free_uclamp_sched_group(struct task_group *tg)
>> > +{
>
> [...]
>
>> > @@ -1417,9 +1444,18 @@ static void __init init_uclamp(void)
>> > #ifdef CONFIG_UCLAMP_TASK_GROUP
>> > /* Init root TG's clamp group */
>> > uc_se = &root_task_group.uclamp[clamp_id];
>> > +
>> > uc_se->effective.value = uclamp_none(clamp_id);
>> > - uc_se->value = uclamp_none(clamp_id);
>> > - uc_se->group_id = 0;
>> > + uc_se->effective.group_id = 0;
>> > +
>> > + /*
>> > + * The max utilization is always allowed for both clamps.
>> > + * This is required to not force a null minimum utiliation on
>> > + * all child groups.
>> > + */
>> > + uc_se->group_id = UCLAMP_NOT_VALID;
>> > + uclamp_group_get(NULL, clamp_id, 0, uc_se,
>> > + uclamp_none(UCLAMP_MAX));
>>
>> I don't quite get why you are using uclamp_none(UCLAMP_MAX) for both
>> UCLAMP_MIN and UCLAMP_MAX clamps. I assume the comment above is to
>> explain this but I'm still unclear why this is done.
>
> That's maybe a bit tricky to get but, this will not happen since for
> root group tasks we apply the system default values... which however
> are introduced by one of the following patches 11/16.
>
> So, my understanding of the "delegation model" is that for cgroups we
> have to ensure each TG is a "restriction" of its parent. Thus:
>
> tg::util_min <= tg_parent::util_min
>
> This is required to ensure that a tg_parent can always restrict
> resources on its descendants. I guess that's required to have a sane
> usage of CGroups for VMs where the Host can transparently control its
> Guests.
>
> According to the above rule, and considering that root task group
> cannot be modified, to allow boosting on TG we are forced to set the
> root group with util_min = SCHED_CAPACITY_SCALE.
>
> Moreover, Tejun pointed out that if we need tuning at root TG level,
> it means that we need system wide tunable, which should be available
> also when CGroups are not in use.
>
> That's why on patch:
>
> [PATCH v4 11/16] sched/core: uclamp: add system default clamps
> https://lore.kernel.org/lkml/20180828135324.21976-12-patrick.bellasi@xxxxxxx/
>
> we add the concept of system default clamps which are actually
> initialized with util_min=0, i.e. 0% boost by default.
>
> These system default clamp values applies to tasks which are running
> either in the root task group on in an autogroup, which also cannot be
> tuned at run-time, whenever the task has not a task specific clamp
> value specified.
>
> All that considered, the code above is still confusing and I should
> consider moving to patch 11/16 the initialization to UCLAMP_MAX for
> util_min...
>
>> Maybe expand the comment to explain the intention?
>
> ... and add there something like:
>
> /*
> * The max utilization is always allowed for both clamps.
> * This satisfies the "delegation model" required by CGroups
> * v2, where a child task group cannot have more resources then
> * its father, thus allowing the creation of child groups with
> * a non null util_min.
> * For tasks within the root_task_group we will use the system
> * default clamp values anyway, thus they will not be boosted
> * to the max utilization by default.
> */
>
> It this more clear ?

Yes, I think so. Thanks for covering that.

>
>
>> With this I think all TGs will get boosted by default, won't they?
>
> You right, at cgroup creation time we clone parent's clamps... thus,
> all root_task_group's children group will get max boosting at creation
> time. However, since we don't have task within a newly created task
> group, the system management software can still refine the clamps
> before staring to move tasks in there.
>
> Do you think we should initialize root task group childrens
> differently ? I would prefer to avoid special cases if not strictly
> required...

I don't see a problem with the current approach.

>
> Cheers,
> Patrick
>
> --
> #include <best/regards.h>
>
> Patrick Bellasi