Re: [PATCH 1/2] sched/uclamp: Add a new sysctl to control RT default boost value

From: Qais Yousef
Date: Tue May 12 2020 - 07:47:06 EST


On 05/12/20 07:40, Pavan Kondeti wrote:
> On Mon, May 11, 2020 at 04:40:52PM +0100, Qais Yousef wrote:
> > RT tasks by default run at the highest capacity/performance level. When
> > uclamp is selected this default behavior is retained by enforcing the
> > requested uclamp.min (p->uclamp_req[UCLAMP_MIN]) of the RT tasks to be
> > uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
> > value.
> >
> > This is also referred to as 'the default boost value of RT tasks'.
> >
> > See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
> >
> > On battery powered devices, it is desired to control this default
> > (currently hardcoded) behavior at runtime to reduce energy consumed by
> > RT tasks.
> >
> > For example, on mobile devices, where the big.LITTLE architecture is
> > dominant, the performance of the little cores varies across SoCs, and
> > on high-end ones the big cores could be too power hungry.
> >
> > Given the diversity of SoCs, the new knob allows manufacturers to tune
> > for the best performance/power trade-off for RT tasks on the particular
> > hardware they run on.
> >
> > They could opt to further tune the value when the user selects
> > a different power saving mode or when the device is actively charging.
> >
> > The runtime aspect of it further helps in creating a single kernel image
> > that can be run on multiple devices that require different tuning.
> >
> > Keep in mind that a lot of RT tasks in the system are created by the
> > kernel. On Android, for instance, I can see over 50 RT tasks, only
> > a handful of which are created by the Android framework.
> >
> > To let system admins and device integrators control the default
> > behavior globally, introduce the new
> > sysctl_sched_uclamp_util_min_rt_default to change the default boost
> > value of the RT tasks.
> >
> > I anticipate this to be mostly in the form of modifying the init script
> > of a particular device.
> >
> > Whenever the default changes, it is applied lazily the next time the
> > scheduler needs to calculate the effective uclamp.min value for the
> > task, assuming that the task still uses the system default value and
> > not a user applied one.
> >
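
A minimal user-space sketch of the lazy update described above, for
illustration only. It is not the patch's code: `task_model`,
`rt_default_min` and `effective_min()` are hypothetical names, and the
model assumes a single flag records whether the task's value came from
the user.

```c
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024u

/* Hypothetical stand-in for the sysctl value; not the kernel's variable. */
static unsigned int rt_default_min = SCHED_CAPACITY_SCALE;

/* Simplified model of an RT task's requested uclamp.min. */
struct task_model {
	unsigned int req_min;	/* last value stored for the task */
	bool user_defined;	/* true once the user set it explicitly */
};

/*
 * Called whenever the effective value is needed. A task still on the
 * system default picks up the current default here, lazily; a task with
 * a user applied value is left alone.
 */
static unsigned int effective_min(struct task_model *t)
{
	if (!t->user_defined)
		t->req_min = rt_default_min;
	return t->req_min;
}
```

In this model, changing `rt_default_min` from 1024 to 128 costs nothing
at write time; a task with `user_defined == false` reports 128 on its
next `effective_min()` call, while a task that asked for 512 keeps 512.
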
> > Tested on Juno-r2 in combination with the RT capacity awareness [1].
> > By default an RT task will go to the highest capacity CPU and run at the
> > maximum frequency, which is particularly energy inefficient on high end
> > mobile devices because the biggest core[s] are 'huge' and power hungry.
> >
> > With this patch the RT task can be controlled to run anywhere by
> > default, and doesn't cause the frequency to be maximum all the time.
> > Yet any task that really needs to be boosted can easily escape this
> > default behavior by modifying its requested uclamp.min value
> > (p->uclamp_req[UCLAMP_MIN]) via the sched_setattr() syscall.
> >
> > [1] 804d402fb6f6: ("sched/rt: Make RT capacity-aware")
> >
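
For reference, a hedged sketch of how a task might escape the default
by requesting its own uclamp.min through sched_setattr(). Assumptions:
the struct below is a local mirror of the kernel's struct sched_attr
(56-byte layout with the util clamp fields), the flag values are copied
from the uapi header, and `uclamp_attr`, `fill_uclamp_min()` and
`request_uclamp_min()` are names made up for this example. The raw
syscall is used because a libc wrapper may not be available.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Local mirror of the kernel's struct sched_attr (assumed layout). */
struct uclamp_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
	uint32_t sched_util_min;
	uint32_t sched_util_max;
};

/* Values as in the kernel's uapi sched.h. */
#define FLAG_KEEP_POLICY	0x08
#define FLAG_KEEP_PARAMS	0x10
#define FLAG_UTIL_CLAMP_MIN	0x20

/* Prepare a request that changes only uclamp.min, keeping policy/params. */
static void fill_uclamp_min(struct uclamp_attr *attr, uint32_t util_min)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->sched_flags = FLAG_KEEP_POLICY | FLAG_KEEP_PARAMS |
			    FLAG_UTIL_CLAMP_MIN;
	attr->sched_util_min = util_min;
}

/* pid 0 means the calling thread. Returns 0 on success, -1 on error. */
static int request_uclamp_min(pid_t pid, uint32_t util_min)
{
	struct uclamp_attr attr;

	fill_uclamp_min(&attr, util_min);
	return (int)syscall(SYS_sched_setattr, pid, &attr, 0);
}
```

Calling `request_uclamp_min(0, 1024)` from an RT thread would ask for
the full boost regardless of the system default; the call is expected
to fail on kernels built without uclamp support.
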
>
> I have tested this patch on SDM845 running v5.7-rc4 and it works as expected.
>
> Default: i.e. /proc/sys/kernel/sched_util_clamp_min_rt_default = 1024.
>
> RT task runs on BIG cluster every time at max frequency. Both effective
> and requested uclamp.min are set to 1024.
>
> With /proc/sys/kernel/sched_util_clamp_min_rt_default = 128
>
> RT task runs on Little cluster (max capacity is 404) and frequency scaling
> happens as per the change in utilization. Both effective and requested
> uclamp.min are set to 128.
>
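
The two settings above can be reproduced from a small helper instead of
an init script. This is only a sketch: `write_clamp()` is a hypothetical
name, the 0..1024 range check is an assumption based on
SCHED_CAPACITY_SCALE, and the path is the one quoted in the test report.

```c
#include <stdio.h>

#define RT_DEFAULT_PATH "/proc/sys/kernel/sched_util_clamp_min_rt_default"

/* Write a clamp value (0..1024) to the given file; 0 on success, -1 on error. */
static int write_clamp(const char *path, unsigned int value)
{
	FILE *f;
	int ok;

	if (value > 1024)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%u\n", value) > 0;
	/* fclose() flushes the buffer, so a rejected write shows up here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

With root privileges, `write_clamp(RT_DEFAULT_PATH, 128)` corresponds to
the second test case and `write_clamp(RT_DEFAULT_PATH, 1024)` restores
the default.
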
> Feel free to add
>
> Tested-by: Pavankumar Kondeti <pkondeti@xxxxxxxxxxxxxx>

Thanks Pavan!

--
Qais Yousef