Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance

From: Mike Galbraith
Date: Fri Feb 02 2018 - 22:55:54 EST


On Fri, 2018-02-02 at 13:34 -0500, Steven Sistare wrote:
> On 2/2/2018 12:39 PM, Steven Sistare wrote:
> > On 2/2/2018 12:21 PM, Peter Zijlstra wrote:
> >> On Fri, Feb 02, 2018 at 11:53:40AM -0500, Steven Sistare wrote:
> >>> It might be interesting to add a tunable for the number of random choices to
> >>> make, and clamp it at the max nr computed from avg_cost in select_idle_cpu.
> >>
> >> This needs a fairly complicated PRNG for it would need to visit each
> >> possible CPU once before looping. A LFSR does that, but requires 2^n-1
> >> elements and we have topology masks that don't match that.. The trivial
> >> example is something with 6 cores.
> >
> > Or keep it simple and accept the possibility of choosing the same candidate
> > more than once.
> >
> >>> Or, choose a random starting point and then search for nr sequential
> >>> candidates; possibly limited by a tunable.
> >>
> >> And this is basically what we already do. Except with the task-cpu
> >> instead of a per-cpu rotor.
> >
> > Righto. Disregard this suggestion.
>
> Actually, I take back my take back. I suspect the primary benefit
> of random selection is that it breaks up resonance states where
> CPUs that are busy tend to stay busy, and CPUs that are idle tend
> to stay idle, which is reinforced by starting the search at target =
> last cpu ran.

I suspect the primary benefit is reduction of bouncing. The absolutely
maddening thing about SIS is that some stuff out there (like FB's load)
doesn't give a rats ass about anything other than absolute minimum
sched latency while other stuff notices cache going missing.  Joy.

-Mike