Re: [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice

From: Qais Yousef
Date: Thu Sep 05 2019 - 07:13:53 EST


On 09/05/19 12:46, Peter Zijlstra wrote:
> On Thu, Sep 05, 2019 at 10:45:27AM +0100, Patrick Bellasi wrote:
>
> > > From just reading the above, I would expect it to have the range
> > > [-20,19] just like normal nice. Apparently this is not so.
> >
> > Regarding the range for the latency-nice values, I guess we have two
> > options:
> >
> > - [-20..19], which makes it similar to priorities
> > downside: we quite likely end up with a kernel space representation
> > which does not match the user-space one, e.g. look at
> > task_struct::prio.
> >
> > - [0..1024], which makes it more similar to a "percentage"
> >
> > Since latency-nice is a new concept, we are not constrained by POSIX and
> > IMHO the [0..1024] scale is a better fit.
> >
> > That will translate into:
> >
> > latency-nice=0 : default (current mainline) behaviour, all "biasing"
> > policies are disabled and we wake up as fast as possible
> >
> > latency-nice=1024 : maximum niceness, where for example we can imagine
> > switching a CFS task to SCHED_IDLE?
>
> There's a few things wrong there; I really feel that if we call it nice,
> it should be like nice. Otherwise we should call it latency-bias and not
> have the association with nice to confuse people.
>
> Secondly; the default should be in the middle of the range. Naturally
> this would be a signed range like nice [-(x+1),x] for some x. But if you
> want [0,1024], then the default really should be 512, but personally I
> like 0 better as a default, in which case we need negative numbers.
>
> This is important because we want to be able to bias towards less
> importance to (tail) latency as well as more importance to (tail)
> latency.
>
> Specifically, Oracle wants to sacrifice (some) latency for throughput.
> Facebook OTOH seems to want to sacrifice (some) throughput for latency.

Another use case I'm considering is using latency-nice to prefer an idle CPU if
latency-nice is set, and otherwise go for the most energy efficient CPU.

I.e.: sacrifice (some) energy for latency.

The way I see it, latency-nice would be interpreted as a binary switch here. But
maybe we can use the range to select how much energy (the "some") we are
willing to sacrifice. Hmmm.
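
To make the two interpretations concrete, here is a rough user-space sketch. It
is not taken from the series: struct cpu_info, pick_cpu(), the
latency_sensitive flag and the max_extra_energy budget are all hypothetical
names, and the energy cost is an assumed abstract number, not anything the
energy model actually exports. The flag models the binary switch; the budget
models using the range to bound how much energy may be traded for latency.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU view; names are for illustration only. */
struct cpu_info {
	bool idle;                 /* nothing currently running here */
	unsigned int energy_cost;  /* assumed relative cost of placing the task */
};

/*
 * Return the index of the chosen CPU, or -1 if there are no CPUs.
 *
 * latency_sensitive: the "binary switch" reading of latency-nice.
 * max_extra_energy:  the "range" reading, i.e. how much more than the
 *                    most efficient CPU we accept paying for an idle one.
 */
static int pick_cpu(const struct cpu_info *cpus, size_t ncpus,
		    bool latency_sensitive, unsigned int max_extra_energy)
{
	int efficient = -1;
	int idle_best = -1;
	size_t i;

	/* Pass 1: the most energy efficient CPU (first one wins a tie). */
	for (i = 0; i < ncpus; i++) {
		if (efficient < 0 ||
		    cpus[i].energy_cost < cpus[efficient].energy_cost)
			efficient = (int)i;
	}

	if (efficient < 0 || !latency_sensitive)
		return efficient;

	/* Pass 2: the cheapest idle CPU that stays within the budget. */
	for (i = 0; i < ncpus; i++) {
		unsigned int extra;

		if (!cpus[i].idle)
			continue;

		extra = cpus[i].energy_cost - cpus[efficient].energy_cost;
		if (extra > max_extra_energy)
			continue;

		if (idle_best < 0 ||
		    cpus[i].energy_cost < cpus[idle_best].energy_cost)
			idle_best = (int)i;
	}

	/* No affordable idle CPU: fall back to the efficient choice. */
	return idle_best >= 0 ? idle_best : efficient;
}
```

With a very large max_extra_energy this degenerates to the pure binary switch
(always take an idle CPU if one exists); with a budget of 0 it only picks an
idle CPU when that costs nothing extra.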

--
Qais Yousef