Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min

From: Qais Yousef
Date: Thu Jan 09 2020 - 06:36:33 EST


On 01/08/20 09:51, Quentin Perret wrote:
> On Tuesday 07 Jan 2020 at 20:30:36 (+0100), Dietmar Eggemann wrote:
> > On 07/01/2020 14:42, Quentin Perret wrote:
> > > Hi Qais,
> > >
> > > On Friday 20 Dec 2019 at 16:48:38 (+0000), Qais Yousef wrote:
> >
> > [...]
> >
> > >> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > >> index e591d40fd645..19572dfc175b 100644
> > >> --- a/kernel/sched/rt.c
> > >> +++ b/kernel/sched/rt.c
> > >> @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
> > >> */
> > >> static void task_woken_rt(struct rq *rq, struct task_struct *p)
> > >> {
> > >> + /*
> > >> + * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
> > >> + * we apply any new value on the next wakeup, which is here.
> > >> + */
> > >> + uclamp_rt_sync_default_util_min(p);
> > >
> > > The task has already been enqueued and sugov has been called by then I
> > > think, so this is a bit late. You could do that in uclamp_rq_inc() maybe?
> >
> > That's probably better.
> > Just to be sure ...we want this feature (an existing rt task gets its
> > UCLAMP_MIN value set when the sysctl changes) because there could be rt
> > tasks running before the sysctl is set?
>
> Yeah, I was wondering the same thing, but I'd expect sysadmin to want
> this. We could change the min clamp of existing RT tasks in userspace
> instead, but given how simple Qais' lazy update code is, the in-kernel
> looks reasonable to me. No strong opinion, though.

The way I see this being used is that the value gets set in init.rc. Any RT
tasks created before that (most likely kthreads) will just be updated on their
next wakeup.

Of course this approach also allows the value to be changed at any point while
the system is running, without having to reboot/recompile or kick a special
script/app to modify all existing RT tasks and continuously monitor new ones.

Another advantage is that apps that have special requirements (like
professional audio) can use the per-task uclamp API to bump their uclamp_min
without conflicting with the desired generic value for all other RT tasks.
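
To illustrate the per-task path, here is a rough userspace sketch of a task
requesting its own uclamp_min through sched_setattr(). The struct is a local
mirror of the kernel's sched_attr, and the names uclamp_attr,
uclamp_min_attr_init(), task_set_uclamp_min() and the UC_FLAG_* macros are made
up for this example; only the syscall, the field layout and the flag values
come from the uapi. Treat it as an illustration, not as part of the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Local mirror of the kernel's struct sched_attr (56-byte layout that
 * carries the util clamp fields). Named differently on purpose so it
 * cannot clash with a libc that already defines struct sched_attr.
 */
struct uclamp_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
	uint32_t sched_util_min;
	uint32_t sched_util_max;
};

/* Same values as SCHED_FLAG_KEEP_POLICY/KEEP_PARAMS/UTIL_CLAMP_MIN. */
#define UC_FLAG_KEEP_POLICY	0x08
#define UC_FLAG_KEEP_PARAMS	0x10
#define UC_FLAG_UTIL_CLAMP_MIN	0x20

/* Fill @attr so that only the min clamp changes; policy/prio are kept. */
void uclamp_min_attr_init(struct uclamp_attr *attr, uint32_t util_min)
{
	assert(attr != NULL);

	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->sched_flags = UC_FLAG_KEEP_POLICY | UC_FLAG_KEEP_PARAMS |
			    UC_FLAG_UTIL_CLAMP_MIN;
	attr->sched_util_min = util_min;
}

/* Returns 0 on success, -1 with errno set otherwise (pid 0 == caller). */
int task_set_uclamp_min(pid_t pid, uint32_t util_min)
{
	struct uclamp_attr attr;

	uclamp_min_attr_init(&attr, util_min);
	return (int)syscall(SYS_sched_setattr, pid, &attr, 0);
}
```

An audio app would call something like task_set_uclamp_min(0, 800) for its RT
thread. The call is expected to fail on kernels built without uclamp support,
so the return value needs checking.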

IOW, we can easily control the baseline performance for RT tasks at run time
with a single knob, without interfering with RT tasks that opt in to modify
their own uclamp values.
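
As a toy model of that behaviour (plain userspace C, not the kernel code), the
lazy update plus opt-in could look roughly like the below. Every name in it
(model_task, rt_default_util_min, model_enqueue() and friends) is invented for
the illustration, and the initial default of 1024 is an assumption of the
model, not something this mail specifies.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_UCLAMP_MAX	1024U	/* assumed full-scale clamp value */

/* Stand-in for the sysctl knob; writable at any time by the "admin". */
unsigned int rt_default_util_min = MODEL_UCLAMP_MAX;

struct model_task {
	unsigned int util_min;	/* clamp currently applied to the task */
	bool user_defined;	/* true once the task opted in itself */
};

/* New RT task: starts with whatever the knob says right now. */
void model_task_init(struct model_task *p)
{
	p->util_min = rt_default_util_min;
	p->user_defined = false;
}

/* Per-task API: the task picks its own value and stops tracking the knob. */
void model_task_set_util_min(struct model_task *p, unsigned int util_min)
{
	assert(util_min <= MODEL_UCLAMP_MAX);
	p->util_min = util_min;
	p->user_defined = true;
}

/* Lazy sync: only tasks that never opted in follow the knob. */
void model_sync_default_util_min(struct model_task *p)
{
	if (!p->user_defined)
		p->util_min = rt_default_util_min;
}

/* Wakeup/enqueue path: sync first, then return the clamp that applies. */
unsigned int model_enqueue(struct model_task *p)
{
	model_sync_default_util_min(p);
	return p->util_min;
}
```

With two tasks, one of which calls model_task_set_util_min(), writing a new
value to rt_default_util_min changes nothing until the next model_enqueue(),
and then only for the task that did not opt in. Nothing has to walk the task
list when the knob changes.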

--
Qais Yousef