Re: [PATCH] sched/uclamp: Introduce a method to transform UCLAMP_MIN into BOOST

From: Qais Yousef
Date: Mon Jul 26 2021 - 13:18:16 EST


Hi Xuewen

On 07/24/21 10:03, Xuewen Yan wrote:
> On Fri, Jul 23, 2021 at 11:19 PM Dietmar Eggemann
> <dietmar.eggemann@xxxxxxx> wrote:
> >
> > On 21/07/2021 09:57, Xuewen Yan wrote:
> > > From: Xuewen Yan <xuewen.yan@xxxxxxxxxx>
> > >
> > > Uclamp can clamp the util between uclamp_min and uclamp_max.
> > > This benefits tasks with small util, but for tasks
> > > with middle util it is useless.

It's not really useless, it works as it's designed ;-)

As Dietmar highlighted, you need to pick a higher boost value that gives you
the best perf/watt for your use case.

> > >
> > > To speed up those tasks, convert UCLAMP_MIN to BOOST,
> > > the BOOST as schedtune does:
> >
> > Maybe it's important to note here that schedtune is the `out-of-tree`
> > predecessor of uclamp used in some Android versions.
>
> Yes, and the patch is indeed used on Android with kernel version 5.4+.

I assume that this is a patch in your own Android 5.4 kernel, right? I'm not
aware of any such patch in Android Common Kernel. If it's there, do you mind
pointing me to the gerrit change that introduced it?

> The kernel used in Android does not have schedtune, and uclamp
> cannot boost all the util; this is the reason for the design of the
> patch.

Do you have a specific workload in mind here that is failing? It would help if
you can explain in detail the mode of failure you're seeing to help us
understand the problem better.

>
> >
> > > boost = uclamp_min / SCHED_CAPACITY_SCALE;
> > > margin = boost * (uclamp_max - util);
> > > boost_util = util + margin;
> >
> > This is essentially the functionality from schedtune_margin() in
> > Android, right?
>
> YES!
>
> >
> > So in your implementation, the margin (i.e. what is added to the task
> > util) not only depends on uclamp_min, but also on `uclamp_max`?
>
> Yes, because we do not want to completely convert uclamp to schedtune.
> We also want users to be able to limit some tasks, so the meaning of
> UCLAMP_MAX has not been changed;
> meanwhile, UCLAMP_MAX can also affect the margin.
>
> >
> > > Scenario:
> > > if the task_util = 200, {uclamp_min, uclamp_max} = {100, 1024}
> > >
> > > without patch:
> > > clamp_util = 200;
> > >
> > > with patch:
> > > clamp_util = 200 + (100 / 1024) * (1024 - 200) = 280;

If a task's util is 200, how long does it take for it to reach 280? Why do you
need this boost applied immediately rather than waiting for that time to
elapse? I'm not sure, but ramping up by 80 points shouldn't take *that* long,
but don't quote me on this :-)
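
As a rough back-of-the-envelope sketch of that ramp time: assuming the default
PELT half-life of 32ms and a task that runs continuously on a CPU at max
capacity and frequency (both are simplifying assumptions, and pelt_ramp_ms()
is a made-up helper, not a kernel function), it can be approximated like this:

```python
import math

SCHED_CAPACITY_SCALE = 1024
PELT_HALFLIFE_MS = 32  # assumption: default PELT half-life


def pelt_ramp_ms(util_from, util_to):
    """Approximate time for util to grow from util_from to util_to.

    Continuous approximation for an always-running task: the gap between
    util and SCHED_CAPACITY_SCALE halves every PELT_HALFLIFE_MS.
    """
    if not 0 <= util_from <= util_to < SCHED_CAPACITY_SCALE:
        raise ValueError("need 0 <= util_from <= util_to < 1024")
    gap_from = SCHED_CAPACITY_SCALE - util_from
    gap_to = SCHED_CAPACITY_SCALE - util_to
    return PELT_HALFLIFE_MS * math.log2(gap_from / gap_to)


if __name__ == "__main__":
    print(f"200 -> 280: ~{pelt_ramp_ms(200, 280):.1f} ms of continuous running")
```

Under these assumptions the 200 to 280 ramp is a little under 5ms of
continuous running. A task that also sleeps will take longer in wall-clock
terms, so treat this as a lower bound rather than a measurement.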

> >
> > The same could be achieved by using {uclamp_min, uclamp_max} = {280, 1024}?
>
> Yes, per-task that is no problem, but per-cgroup, most of the time
> we cannot put only the special tasks into the cgroup.
> For example, in Android there is a cgroup named "top-app"; often all
> the threads of an app are put into it.
> But not all threads of this app need to be boosted. If we set
> uclamp_min too big, all the small tasks would be clamped to
> uclamp_min and
> the power consumption would increase; however, if we set
> uclamp_min smaller, the performance may be increased.
> Such as:
> a task's util is 50, {uclamp_min, uclamp_max} = {100, 1024}
> the boost_util = 50 + (100 / 1024) * (1024 - 50) = 145;
> but if we set {uclamp_min, uclamp_max} = {280, 1024}, without patch:
> the clamp_util = 280.

I assume {uclamp_min, uclamp_max} = {145, 1024} is not good enough because you
want this 200 task to be boosted to 280. One can argue that not all tasks at
200 need to be boosted to 280 either. So the question is, like above, what type
of tasks are failing here and how do you observe this failure? It seems
there's a class of performance critical tasks that need this fast boosting.
Can't you identify them and boost them individually?
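
To make the numbers in this thread easier to compare, here is a small sketch
of plain uclamp clamping next to the boost formula from the patch description.
It only mirrors the arithmetic quoted above, using integer math; it is not the
patch's actual code, and the function names are made up for illustration.

```python
SCHED_CAPACITY_SCALE = 1024


def clamp_util(util, uclamp_min, uclamp_max):
    """Plain uclamp: restrict util to [uclamp_min, uclamp_max]."""
    return max(uclamp_min, min(util, uclamp_max))


def boost_util(util, uclamp_min, uclamp_max):
    """Proposed boost: util + (uclamp_min / 1024) * (uclamp_max - util)."""
    if util >= uclamp_max:
        # Assumption: no headroom left, so just cap at uclamp_max.
        return uclamp_max
    margin = uclamp_min * (uclamp_max - util) // SCHED_CAPACITY_SCALE
    return util + margin


if __name__ == "__main__":
    for util in (50, 200):
        print(util,
              boost_util(util, 100, 1024),   # proposed boost with {100, 1024}
              clamp_util(util, 145, 1024),   # plain clamp with {145, 1024}
              clamp_util(util, 280, 1024))   # plain clamp with {280, 1024}
```

With {100, 1024} the boost formula gives 145 for the util 50 task and 280 for
the util 200 task. A plain clamp at {145, 1024} matches the first but leaves
the second at 200, while {280, 1024} matches the second but lifts the small
task to 280 as well.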

There's nothing that prevents you from changing the uclamp_min of the cgroup
dynamically, by the way. For instance, when an app launches you can choose
a high boost value, then lower it once it has started up. Or if you know the
top-app is a game and you want to guarantee a good minimum performance for it,
you can choose to increase the top-app uclamp_min value too in a special gaming
mode or something.
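
A minimal sketch of what such a dynamic update could look like from userspace.
The cgroup cpu.uclamp.min file takes a percentage rather than a 0..1024 value,
so the helper converts first. The /dev/cpuctl/top-app path in the comment is
an assumption about how the cpu controller is mounted on the device, and both
helper names are hypothetical.

```python
import os

SCHED_CAPACITY_SCALE = 1024


def util_to_percent(util):
    """Convert a 0..1024 util value to the percentage string cgroup expects."""
    if not 0 <= util <= SCHED_CAPACITY_SCALE:
        raise ValueError("util must be within 0..1024")
    return f"{util * 100 / SCHED_CAPACITY_SCALE:.2f}"


def set_cgroup_uclamp_min(cgroup_dir, util):
    """Write a new uclamp_min to the given cgroup directory."""
    path = os.path.join(cgroup_dir, "cpu.uclamp.min")
    with open(path, "w") as f:
        f.write(util_to_percent(util))


# Example (needs root and a kernel with uclamp cgroup support):
#   set_cgroup_uclamp_min("/dev/cpuctl/top-app", 512)  # app launch: boost high
#   set_cgroup_uclamp_min("/dev/cpuctl/top-app", 100)  # started up: lower it
```

The values 512 and 100 are only placeholders; the right numbers depend on the
platform and on the perf/watt trade-off you are after.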

For best perf/watt, the per-task API is the way forward. I understand it'll
take time for apps and the Android framework to learn how to use the per-task
API most effectively, but it is what we should be aiming for.

Cheers

--
Qais Yousef

>
> >
> > Uclamp_min is meant to be the final `boost( = util + margin)`
> > information. You just have to set it appropriately to the task (via
> > per-task and/or per-cgroup interface).
>
> As said above, it is sometimes difficult to set the per-cgroup uclamp_min
> for all tasks in Android.
>
> >
> > Uclamp_min is for `boosting`, Uclamp_max is for `capping` CPU frequency.
>
> Yes!
>
> >
>
> Thanks!
> xuewen