Re: [PATCH] sched/uclamp: Introduce a method to transform UCLAMP_MIN into BOOST

From: Qais Yousef
Date: Tue Jul 27 2021 - 09:45:18 EST


Hi Xuewen

On 07/27/21 20:16, Xuewen Yan wrote:
> Hi Qais
>
> On Tue, Jul 27, 2021 at 1:17 AM Qais Yousef <qais.yousef@xxxxxxx> wrote:
> >
> > > > >
> > > > > The uclamp can clamp the util within uclamp_min and uclamp_max,
> > > > > it is of benefit to some tasks with small util, but for those tasks
> > > > > with middle util, it is useless.
> >
> > It's not really useless, it works as it's designed ;-)
>
> Yes, my expression problem...

No worries, I understood what you meant. But I had to highlight that this is
the intended design behavior :-)

>
> >
> > As Dietmar highlighted, you need to pick a higher boost value that gives you
> > the best perf/watt for your use case.
> >
> > I assume that this is a patch in your own Android 5.4 kernel, right? I'm not
>
> Yes, the patch indeed is used in my own Android12 with kernel5.4.
>
> > aware of any such patch in Android Common Kernel. If it's there, do you mind
> > pointing me to the gerrit change that introduced it?
>
> emmm, sorry, I kind of understand what that means. Do you mean that what
> I need to do is send this patch to Google?

Oh no. I meant if you are *not* carrying this patch on your own, I'd appreciate
getting a link to where it was merged into Google's tree. But you already said
you carry this patch on your own kernel, so there's nothing to do :)

>
> >
> > > Because the kernel used in Android does not have schedtune, and uclamp
> > > cannot boost all the util; this is the reason for the design of the patch.
> >
> > Do you have a specific workload in mind here that is failing? It would help if
> > you can explain in detail the mode of failure you're seeing to help us
> > understand the problem better.
>
> The patch has been working for me for a while. I can redo this
> data, but this might take a while :)

But there must have been a reason you introduced it in the first place. What
was that reason?

>
> > >
> > > >
> > > > > Scenario:
> > > > > if the task_util = 200, {uclamp_min, uclamp_max} = {100, 1024}
> > > > >
> > > > > without patch:
> > > > > clamp_util = 200;
> > > > >
> > > > > with patch:
> > > > > clamp_util = 200 + (100 / 1024) * (1024 - 200) = 280;
> >
> > If a task util was 200, how long does it take for it to reach 280? Why do you
> > need to have this immediate boost value applied and can't wait for this time to
> > lapse? I'm not sure, but ramping up by 80 points shouldn't take *that* long,
> > but don't quote me on this :-)
>
> Here is just one example to illustrate that, with this patch, it can also
> boost the util which is within {UCLAMP_MIN, UCLAMP_MAX}...
>
> >
> > > >
> > > > The same could be achieved by using {uclamp_min, uclamp_max} = {280, 1024}?
> > >
> > > Yes, for per-task, that is no problem, but for per-cgroup, most times,
> > > we can not always only put the special task into the cgroup.
> > > For example, in Android, there is a cgroup named "top-app"; often all
> > > the threads of an app would be put into it.
> > > But not all threads of this app need to be boosted. If we set the
> > > uclamp_min too big, then all the small tasks would be clamped to
> > > uclamp_min and
> > > the power consumption would be increased; however, if setting the
> > > uclamp_min smaller, the performance may be increased.
> > > Such as:
> > > a task's util is 50, {uclamp_min, uclamp_max} = {100, 1024}
> > > the boost_util = 50 + (100 / 1024) * (1024 - 50) = 145;
> > > but if we set {uclamp_min, uclamp_max} = {280, 1024}, without patch:
> > > the clamp_util = 280.
> >
> > I assume {uclamp_min, uclamp_max} = {145, 1024} is not good enough because you
> > want this 200 task to be boosted to 280. One can argue that not all tasks at
> > 200 need to be boosted to 280 too. So the question is, like above, what type
> > of tasks are failing here and how do you observe this failure? It seems
> > there's a class of performance critical tasks that need this fast boosting.
> > Can't you identify them and boost them individually?
>
> Yes, the best way to do that is boosting them individually, but
> usually, it may not be so easy...

Yes, I appreciate that, but cgroup is a coarse-grained controller. Even with
your approach, you will still have to find the best compromise, because some
tasks will get more boosting than they really need and waste power.
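
To make the numbers quoted in this thread concrete, here is a minimal sketch
that compares plain clamping with the boost formula from the examples above.
The function names are hypothetical, integer (floor) arithmetic is an
assumption, and the sketch only models the formula as written in this thread,
not the actual patch code (for instance, no capping at uclamp_max is applied to
the boosted value).

```python
SCHED_CAPACITY_SCALE = 1024  # full CPU capacity on the util/uclamp scale


def clamp_util(util, uclamp_min, uclamp_max=SCHED_CAPACITY_SCALE):
    """Plain uclamp behaviour: restrict util to [uclamp_min, uclamp_max]."""
    return max(uclamp_min, min(util, uclamp_max))


def boost_util(util, uclamp_min):
    """Boost formula as quoted in this thread (hypothetical helper):
    util + (uclamp_min / 1024) * (1024 - util), using floor division."""
    headroom = SCHED_CAPACITY_SCALE - util
    return util + uclamp_min * headroom // SCHED_CAPACITY_SCALE


if __name__ == "__main__":
    # Task utilizations taken from the examples in this thread.
    for util in (50, 200):
        print(
            f"util={util}: "
            f"clamp{{100,1024}}={clamp_util(util, 100)} "
            f"clamp{{280,1024}}={clamp_util(util, 280)} "
            f"boost(min=100)={boost_util(util, 100)}"
        )
```

With a cgroup-wide uclamp_min of 100, the util=50 task ends up at 145 under the
boost formula versus 100 under plain clamping, so every task in the group is
still pushed up by a single shared setting, whether or not it needs it.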

For the best outcome with uclamp, the cgroup should be used to specify the
minimum performance requirement of a class of tasks, and the per-task API
should then be used to fine-tune the settings for specific tasks.

I appreciate it'll take time to get there, but this is the best way forward.

If you have a specific use case that's failing, it would still be good to share
the details so we can think more about whether there's something we can do
about it at the kernel level.

Thanks

--
Qais Yousef