Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min

From: Peter Zijlstra
Date: Fri Jan 10 2020 - 08:40:26 EST


On Thu, Jan 09, 2020 at 01:00:58PM +0000, Qais Yousef wrote:
> On 01/08/20 14:44, Peter Zijlstra wrote:
> > On Fri, Dec 20, 2019 at 04:48:38PM +0000, Qais Yousef wrote:
> > > RT tasks by default try to run at the highest capacity/performance
> > > level. When uclamp is selected this default behavior is retained by
> > > enforcing the uclamp_util_min of the RT tasks to be
> > > uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE, the maximum
> > > value.
> > >
> > > See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
> > >
> > > On battery powered devices, this default behavior could consume more
> > > power, and it is desired to be able to tune it down. While uclamp allows
> > > tuning this by changing the uclamp_util_min of the individual tasks,
> > > this is cumbersome and error prone.
> > >
> > > To control the default behavior globally by system admins and device
> > > integrators, introduce the new sysctl_sched_rt_uclamp_util_min to
> > > change the default uclamp_util_min value of the RT tasks.
> > >
> > > Whenever the new default changes, it'd be applied on the next wakeup of
> > > the RT task, assuming that it still uses the system default value and
> > > not a user applied one.
> >
> > This is because these RT tasks are not in a cgroup or not affected by
> > cgroup settings? I feel the justification is a little thin here.
>
> The uclamp_min for RT tasks is always hardcoded to 1024 at the moment. So even
> if they belong to a cgroup->uclamp_min = 0, they'll still run at max frequency,
> no?

Argh, this is that counterintuitive max aggregate nonsense biting me.
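
For anyone following along, a minimal sketch of what max aggregation means
here: the runqueue-level util_min is the largest util_min among the runnable
tasks, so a single RT task clamped to 1024 pushes the CPU to max frequency
no matter how low the other tasks are clamped. The struct and function names
below are hypothetical and simplified, not the kernel's; the real logic lives
in kernel/sched/core.c and also applies cgroup and system-wide restrictions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one runnable task's util_min clamp. */
struct task_clamp {
	unsigned int util_min;
};

/*
 * Max aggregation (illustrative only): return the largest util_min among
 * the runnable tasks. With no runnable tasks the result is 0.
 */
static unsigned int rq_util_min(const struct task_clamp *tasks, size_t nr)
{
	unsigned int max = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (tasks[i].util_min > max)
			max = tasks[i].util_min;
	}

	return max;
}
```

With tasks clamped to 0, 1024 and 128, rq_util_min() yields 1024: the one
task at the maximum value dominates the result.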