Re: [question] sched: idle_avg and migration latency

From: Mike Galbraith
Date: Wed Dec 11 2013 - 01:44:22 EST


On Tue, 2013-12-10 at 19:31 +0100, Daniel Lezcano wrote:

> I think I am a bit puzzled with the 'idle_avg' name. I am guessing the
> semantic of this variable is "how long this cpu has been idle".

Average distance between idles.

> The idle duration, with the no_hz, could be long, several seconds if the
> work queues have been migrated and if the timer affinity is set to
> another cpu. So if we fall in this case and there is a burst of activity
> + micro-idle and idle_avg is not leverage to max, it will stay high
> during an amount of time, thus pulling tasks at each micro idle period,
> right ?

Yeah, it cares about shutting the thing down when idle distance is too
small to be affordable, but cranking it back up quickly so as not to
damage generic bursty load utilization too much. It tries to be dirt
simple and cheap, not perfect.
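
A rough userspace sketch of that shut-down/crank-up behaviour, for
illustration only. It assumes the average is a 1/8-weight running average
that is snapped straight to a ceiling of twice the migration cost after
any long idle period, and that idle balancing is skipped while the
average sits below the migration cost. The names rq_sketch,
account_idle_exit, idle_balance_allowed and MIGRATION_COST_NS are made up
for the example, and the 500000 ns value is an assumed default, not
something stated in this thread.

```c
#include <stdint.h>

/* Assumed stand-in for the scheduler's migration cost tunable, in ns. */
#define MIGRATION_COST_NS 500000ULL

/* Hypothetical, stripped-down per-CPU runqueue state. */
struct rq_sketch {
	uint64_t avg_idle;	/* running average of idle period length, ns */
};

/* Move the average 1/8 of the way toward the new sample. */
static void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg = (uint64_t)((int64_t)*avg + diff / 8);
}

/* Called when the CPU leaves idle after having been idle for idle_ns. */
static void account_idle_exit(struct rq_sketch *rq, uint64_t idle_ns)
{
	uint64_t max = 2 * MIGRATION_COST_NS;

	if (idle_ns > max)
		rq->avg_idle = max;	/* long idle: crank straight back up */
	else
		update_avg(&rq->avg_idle, idle_ns);	/* short idle: decay */
}

/* Pulling tasks is only considered worthwhile when idles are long enough. */
static int idle_balance_allowed(const struct rq_sketch *rq)
{
	return rq->avg_idle >= MIGRATION_COST_NS;
}
```

Under these assumptions, a CPU coming out of a multi-second nohz idle
starts at the ceiling, and a run of 10 us micro-idles pulls the average
below the threshold after a handful of samples, so the window in which
every micro-idle triggers a pull is short. A single long idle afterwards
restores the ceiling in one step.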

For nohz_full loads, you'll likely want to kill most if not all wake and
idle balancing, or at least put some serious roadblocks up... but then
you'll have isolated and pinned everything anyway if you deeply care
about perturbation. All load balancing totally sucks in that regard, as
do those darn workqueues you mentioned.

-Mike
