Re: [PATCH 03/16] sched/fair: Disregard idle task wakee_flips in wake_wide

From: Morten Rasmussen
Date: Mon May 23 2016 - 10:10:01 EST


On Mon, May 23, 2016 at 03:00:46PM +0200, Mike Galbraith wrote:
> On Mon, 2016-05-23 at 13:00 +0100, Morten Rasmussen wrote:
>
> > The problem then seems to be distinguishing truly idle from busy doing
> > interrupts. The issue that I observe is that wake_wide() likes pushing
> > tasks around in lightly loaded scenarios, which isn't desirable for
> > power management. Selecting the same cpu again may potentially let
> > others reach deeper C-states.
> >
> > With that in mind I will see if I can do better. Suggestions are welcome :-)
>
> None here. For big boxen that are highly idle, you'd likely want to
> shut down nodes and consolidate load, but otoh, all that slows response
> to burst, which I hate. I prefer race to idle, let power gating do its
> job. If I had a server farm with enough capacity vs load variability
> to worry about, I suspect I'd become highly interested in routing.

I don't disagree for systems of that scale, but at the other end of the
spectrum it is a single SoC we are trying to squeeze the best possible
mileage out of. That implies optimizing for power gating to reach deeper
C-states when possible by consolidating idle time and grouping
idle cpus. Migrating tasks unnecessarily isn't helping us achieve
that, unfortunately :-(