Re: [PATCH] sched/uclamp: Avoid setting cpu.uclamp.min bigger than cpu.uclamp.max

From: Xuewen Yan
Date: Sat Jun 05 2021 - 09:27:27 EST


Hi Qais

On Sat, Jun 5, 2021 at 7:49 PM Qais Yousef <qais.yousef@xxxxxxx> wrote:
> > >
> > In addition,In your patch:
> > 6938840392c89 ("sched/uclamp: Fix wrong implementation of cpu.uclamp.min")
> > https://lkml.kernel.org/r/20210510145032.1934078-2-qais.yousef@xxxxxxx
> >
> > + switch (clamp_id) {
> > + case UCLAMP_MIN: {
> > + struct uclamp_se uc_min = task_group(p)->uclamp[clamp_id];
> > + if (uc_req.value < uc_min.value)
> > + return uc_min;
> > + break;
> >
> > When the clamp_id = UCLAMP_MIN, why not judge the uc_req.value is
> > bigger than task_group(p)->uclamp[UCLAMP_MAX] ?
>
> Because of the requirement I pointed you to in cgroup-v2.rst. We must allow any
> value to be requested.
>
> Ultimately if we had
>
> cpu.uclamp.min = 80
> cpu.uclamp.max = 50
>
> then we want to remember the original request but make sure the effective value
> is capped.
>
> For the user in the future modifies the values such that
>
> cpu.uclamp.max = max
>
> Then we want to remember cpu.uclamp.min = 80 and apply it since now the
> cpu.uclamp.max was relaxed to allow the boost value.
>
> > Because when the p->uclamp_req[UCLAMP_MIN] > task_group(p)->uclamp[UCLAMP_MAX],
> > the patch can not clamp the p->uclamp_req[UCLAMP_MIN/MAX] into
> > [ task_group(p)->uclamp[UCLAMP_MAX], task_group(p)->uclamp[UCLAMP_MAX] ].
> >
> > Is it necessary to fix it here?
>
> Nope. We must allow any combination values to be accepted and remember them so
> if one changes we ensure the new effective value is updated accordingly.
> This is how cgroups API works.

Sorry, I may not have expressed it clearly. In your patch (which has
not yet been merged into mainline):

6938840392c89 ("sched/uclamp: Fix wrong implementation of cpu.uclamp.min")
https://lkml.kernel.org/r/20210510145032.1934078-2-qais.yousef@xxxxxxx

This patch will not affect p->uclamp_req, but consider the following situation:

tg->cpu.uclamp.min = 0
tg->cpu.uclamp.max = 50%

p->uclamp_req[UCLAMP_MIN] = 60%
p->uclamp_req[UCLAMP_MAX] = 80%

The call chain is as follows:
uclamp_eff_value() -> uclamp_eff_get() -> uclamp_tg_restrict()

With your patch, the result is:

p->effective_uclamp_min = 60%
p->effective_uclamp_max = 50%
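
To make the numbers above easy to check, here is a minimal user-space
sketch of the per-clamp restriction as I understand it. It is not the
kernel code: tg_restrict() and the enum names are hypothetical, plain
percent values are used instead of the kernel's 0..1024 scale, and the
UCLAMP_MAX branch is my assumption of how the patch mirrors the quoted
UCLAMP_MIN branch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's clamp ids. */
enum clamp_id { CLAMP_MIN, CLAMP_MAX, CLAMP_CNT };

/*
 * Sketch of the restriction: each clamp id is compared only against the
 * task group's value for the SAME id. Values are percentages here.
 */
static unsigned int tg_restrict(enum clamp_id id, unsigned int req,
				const unsigned int tg[CLAMP_CNT])
{
	switch (id) {
	case CLAMP_MIN:
		/* The request is only raised to the group's min. */
		if (req < tg[CLAMP_MIN])
			return tg[CLAMP_MIN];
		break;
	case CLAMP_MAX:
		/* Assumed mirror case: the request is only capped by the group's max. */
		if (req > tg[CLAMP_MAX])
			return tg[CLAMP_MAX];
		break;
	default:
		break;
	}
	/* Nothing here compares a min request against the group's max. */
	return req;
}
```

With tg = { 0, 50 }, calling tg_restrict(CLAMP_MIN, 60, tg) returns 60
and tg_restrict(CLAMP_MAX, 80, tg) returns 50, so the effective min
ends up above the effective max.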

It would not affect uclamp_task_util(p), but it does affect the rq.
When p is enqueued:
rq->uclamp[UCLAMP_MIN] = 60%
rq->uclamp[UCLAMP_MAX] = 50%

Furthermore, in uclamp_rq_util_with() {
...

min_util = READ_ONCE(rq->uclamp[UCLAMP_MIN].value); /* 60% */
max_util = READ_ONCE(rq->uclamp[UCLAMP_MAX].value); /* 50% */
...
if (unlikely(min_util >= max_util))
return min_util;

return clamp(util, min_util, max_util);
...
}
As a result, it would return 60%, which is above the task group's
cpu.uclamp.max of 50%.
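
The final step can be reproduced with a small user-space sketch. It is
only a model of the tail of uclamp_rq_util_with() shown above:
rq_util_with() is a hypothetical name, the rq values are passed in as
parameters, and percentages are used instead of the kernel's 0..1024
scale.

```c
#include <assert.h>

/*
 * Model of the end of uclamp_rq_util_with(): when the clamps are
 * inverted (min >= max), min wins regardless of util.
 */
static unsigned long rq_util_with(unsigned long util,
				  unsigned long min_util,
				  unsigned long max_util)
{
	if (min_util >= max_util)
		return min_util;

	/* Open-coded equivalent of clamp(util, min_util, max_util). */
	if (util < min_util)
		return min_util;
	if (util > max_util)
		return max_util;
	return util;
}
```

With min_util = 60 and max_util = 50, rq_util_with() returns 60 for any
util, while a non-inverted pair such as min_util = 0, max_util = 50
caps a util of 55 at 50.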

Thanks!
xuewen