Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min

From: Peter Zijlstra
Date: Wed Jan 08 2020 - 08:45:08 EST


On Fri, Dec 20, 2019 at 04:48:38PM +0000, Qais Yousef wrote:
> RT tasks by default try to run at the highest capacity/performance
> level. When uclamp is selected this default behavior is retained by
> enforcing the uclamp_util_min of the RT tasks to be
> uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
> value.
>
> See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
>
> On battery powered devices, this default behavior could consume more
> power, and it is desired to be able to tune it down. While uclamp allows
> tuning this by changing the uclamp_util_min of the individual tasks,
> this is cumbersome and error prone.
>
> To let system admins and device integrators control the default
> behavior globally, introduce the new sysctl_sched_rt_uclamp_util_min to
> change the default uclamp_util_min value of the RT tasks.
>
> Whenever the new default changes, it'd be applied on the next wakeup of
> the RT task, assuming that it still uses the system default value and
> not a user applied one.

Is this because these RT tasks are not in a cgroup, or are not affected
by cgroup settings? I feel the justification is a little thin here.

> If the uclamp_util_min of an RT task is 0, then the RT utilization of
> the rq is used to drive the frequency selection in schedutil for RT
> tasks.

Did cpu_uclamp_write() forget to check for input < 0?
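
To illustrate the kind of range check being asked about, here is a
minimal userspace sketch. It is not the actual cpu_uclamp_write() code:
validate_util_min() and CAPACITY_SCALE are hypothetical names used only
for this example, and the only value taken from the kernel is the scale
of 1024 (SCHED_CAPACITY_SCALE).

```c
#include <errno.h>

/* Assumption: mirrors the kernel's SCHED_CAPACITY_SCALE (1 << 10). */
#define CAPACITY_SCALE 1024

/*
 * Hypothetical helper, not kernel code: validate a requested
 * uclamp_util_min before storing it.
 *
 * Returns 0 and writes *out on success; returns -EINVAL and leaves
 * *out untouched if the request is negative or above the scale.
 */
static int validate_util_min(long requested, unsigned int *out)
{
	if (requested < 0 || requested > CAPACITY_SCALE)
		return -EINVAL;

	*out = (unsigned int)requested;
	return 0;
}
```

With a check of this shape, a write of 0 is accepted as a valid clamp
while a negative value is rejected instead of being silently converted
to a large unsigned number.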