Re: [PATCH] sched/uclamp: Avoid setting cpu.uclamp.min bigger than cpu.uclamp.max
From: Qais Yousef
Date: Sat Jun 05 2021 - 07:49:15 EST

On 06/05/21 10:12, Xuewen Yan wrote:
> Hi Qais,
>
> On Sat, Jun 5, 2021 at 12:08 AM Qais Yousef <qais.yousef@xxxxxxx> wrote:
> >
> > On 06/03/21 10:24, Xuewen Yan wrote:
> > > +CC Qais
> >
> > Thanks for the CC :)
> >
> > >
> > >
> > > Hi Quentin
> > >
> > > On Wed, Jun 2, 2021 at 9:22 PM Quentin Perret <qperret@xxxxxxxxxx> wrote:
> > > >
> > > > +CC Patrick and Tejun
> > > >
> > > > On Wednesday 02 Jun 2021 at 20:38:03 (+0800), Xuewen Yan wrote:
> > > > > From: Xuewen Yan <xuewen.yan@xxxxxxxxxx>
> > > > >
> > > > > When setting cpu.uclamp.min/max in cgroup, there is no validating
> > > > > like uclamp_validate() in __sched_setscheduler(). It may cause the
> > > > > cpu.uclamp.min is bigger than cpu.uclamp.max.
> > > >
> > > > ISTR this was intentional. We also allow child groups to ask for
> > > > whatever clamps they want, but that is always limited by the parent, and
> > > > reflected in the 'effective' values, as per the cgroup delegation model.
> >
> > As Quentin said. This is intentional, to comply with the cgroup model.
> >
> > See Limits and Protections sections in Documentation/admin-guide/cgroup-v2.rst
> >
> > Specifically
> >
> > "all configuration combinations are valid"
> >
> > So user can set cpu.uclamp.min higher than cpu.uclamp.max. But when we apply
> > the setting, cpu.uclamp.min will be capped by cpu.uclamp.max. I can see you
> > found the cpu_util_update_eff() logic.
> >
>
> Thanks a lot for your patience to explain, sorry for my ignorance of
> Documentation/admin-guide/cgroup-v2.rst.
No problem :)
>
> > >
> > > It does not affect the 'effective' value. That because there is
> > > protection in cpu_util_update_eff():
> > > /* Ensure protection is always capped by limit */
> > > eff[UCLAMP_MIN] = min(eff[UCLAMP_MIN], eff[UCLAMP_MAX]);
> > >
> > > When users set the cpu.uclamp.min > cpu.uclamp.max:
> > > cpu.uclamp.max = 50;
> > > to set : cpu.uclamp.min = 60;
> > > That would make the uclamp_req[UCLAMP_MIN].value = 1024* 60% = 614,
> > > uclamp_req[UCLAMP_MAX].value = 1024* 50% = 512;
> > > But finally, the uclamp[UCLAMP_MIN].value = uclamp[UCLAMP_MAX].value
> > > = 1024* 50% = 512;
> > >
> > > Is it deliberately set not to validate because of the above?
> >
> > Sorry I'm not following you here. What code paths were you trying to explain
> > here?
> >
> > Did you actually hit any problem here?
>
> I just gave an example of the difference of uclamp_req and uclamp
> without my patch, and can ignore it.
Cool.
>
> >
> In addition, in your patch:
> 6938840392c89 ("sched/uclamp: Fix wrong implementation of cpu.uclamp.min")
> https://lkml.kernel.org/r/20210510145032.1934078-2-qais.yousef@xxxxxxx
>
> + switch (clamp_id) {
> + case UCLAMP_MIN: {
> + struct uclamp_se uc_min = task_group(p)->uclamp[clamp_id];
> + if (uc_req.value < uc_min.value)
> + return uc_min;
> + break;
>
> When the clamp_id = UCLAMP_MIN, why not judge the uc_req.value is
> bigger than task_group(p)->uclamp[UCLAMP_MAX] ?
Because of the requirement I pointed you to in cgroup-v2.rst. We must allow any
value to be requested.

Ultimately if we had

	cpu.uclamp.min = 80
	cpu.uclamp.max = 50

then we want to remember the original request but make sure the effective value
is capped.

If the user later modifies the values such that

	cpu.uclamp.max = max

then we want to have remembered cpu.uclamp.min = 80 and apply it, since
cpu.uclamp.max has now been relaxed to allow the boost value.
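
To make the request vs. effective distinction concrete, here is a rough
user-space sketch. It is not the kernel code: struct clamp_model and the helper
names are made up for illustration, it ignores the parent hierarchy entirely,
and the percent conversion assumes round-to-nearest on the 1024 scale used
earlier in this thread.

```c
#include <assert.h>

#define CAPACITY_SCALE 1024u	/* same 1024 scale as in the examples above */

/* Hypothetical model of one cgroup: only the *requested* values are stored. */
struct clamp_model {
	unsigned int req_min;	/* what was written to cpu.uclamp.min */
	unsigned int req_max;	/* what was written to cpu.uclamp.max */
};

/* Convert a percentage to the 1024 scale, rounding to nearest (assumed). */
unsigned int pct_to_scale(unsigned int pct)
{
	return (pct * CAPACITY_SCALE + 50) / 100;
}

/* The limit is taken as requested in this simplified model. */
unsigned int effective_max(const struct clamp_model *c)
{
	return c->req_max;
}

/* The protection is always capped by the limit; the request is untouched. */
unsigned int effective_min(const struct clamp_model *c)
{
	return c->req_min < c->req_max ? c->req_min : c->req_max;
}

/* Walk through the min = 80, max = 50 scenario described above. */
void clamp_model_demo(void)
{
	struct clamp_model cg = { pct_to_scale(80), pct_to_scale(50) };

	/* The request is accepted and remembered as-is ... */
	assert(cg.req_min == 819);
	/* ... but the effective boost is capped by the limit. */
	assert(effective_min(&cg) == 512);

	/* Later the user relaxes the limit: cpu.uclamp.max = max */
	cg.req_max = pct_to_scale(100);

	/* The remembered request now takes effect without being rewritten. */
	assert(effective_min(&cg) == 819);
}
```

Rejecting min > max at write time would make the last step impossible, because
there would be no original request of 80 left to fall back on.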
> Because when the p->uclamp_req[UCLAMP_MIN] > task_group(p)->uclamp[UCLAMP_MAX],
> the patch can not clamp the p->uclamp_req[UCLAMP_MIN/MAX] into
> [ task_group(p)->uclamp[UCLAMP_MAX], task_group(p)->uclamp[UCLAMP_MAX] ].
>
> Is it necessary to fix it here?
Nope. We must accept any combination of values and remember them, so that if
one changes, the effective value is updated accordingly.

This is how the cgroup API works.

Hope this makes sense.

Cheers
--
Qais Yousef