Re: [PATCH 1/2] sched: fix and clean up calculate_imbalance

From: Peter Zijlstra
Date: Tue Jul 29 2014 - 11:50:10 EST

On Tue, Jul 29, 2014 at 11:15:49AM -0400, Rik van Riel wrote:

> > Right, so we want that code for overloaded -> overloaded migrations
> > such as not to cause idle cpus in an attempt to balance things.
> > Idle cpus are worse than imbalance.
> >
> > But in case of overloaded/imb -> !overloaded migrations we can
> > allow it, and in fact want to allow it in order to balance idle
> > cpus.
> In case the destination is over the average load, or the source is under
> the average load, fix_small_imbalance() determines env->imbalance.
> The "load_above_capacity" calculation is only reached when busiest is
> busier than average, and the destination is under the average load.
> In that case, env->imbalance ends up as the minimum of busiest - avg
> and avg - target.
> Is there any case where limiting it further to "load - capacity" from
> the busiest domain makes a difference?

Sadly yes; suppose 8 cpus in 2 groups and 9 tasks: 8 tasks of weight 10
and 1 of weight 1024. The local group will have 5 tasks of 10, the
busiest will have the remaining 4 (3 tasks of 10 plus the 1024 one).

The sd avg is 138, local avg is 12, busiest avg is 263.

This gives: busiest - avg = 125, avg - local = 126

So an imbalance of 125.
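
For reference, the arithmetic above can be reproduced with a few lines of
Python. This is only a sketch: the variable names are made up, per-cpu
averages use integer division, and it ignores the capacity scaling the
kernel applies, so the differences can come out a few units off from
hand-rounded numbers. Either way the bound is far larger than the 30 units
of load carried by the three light tasks in the busiest group.

```python
# Illustrative model of the example; not kernel code.
CPUS_PER_GROUP = 4

local_tasks = [10] * 5             # five light tasks
busiest_tasks = [10] * 3 + [1024]  # three light tasks plus the heavy one

local_avg = sum(local_tasks) // CPUS_PER_GROUP      # 50 // 4
busiest_avg = sum(busiest_tasks) // CPUS_PER_GROUP  # 1054 // 4
sd_avg = (sum(local_tasks) + sum(busiest_tasks)) // (2 * CPUS_PER_GROUP)

# The imbalance is bounded by how far busiest sits above the domain
# average and by how far local sits below it.
max_pull = busiest_avg - sd_avg
room_in_local = sd_avg - local_avg
imbalance = min(max_pull, room_in_local)

print(sd_avg, local_avg, busiest_avg, imbalance)
```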

Without limiting it further, we would migrate all 3 weight-10 tasks over
to local and leave 3 cpus in the busiest group idle.
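
A rough way to see the effect of the extra limit is sketched below. It is
a toy model under stated assumptions, not the kernel's logic: pull_tasks()
and idle_cpus() are hypothetical helpers, tasks are moved greedily lightest
first, the budget of 100 is just an illustrative value (anything >= 30
behaves the same), and "load above capacity" is approximated as the number
of tasks beyond one per cpu times the nice-0 weight of 1024.

```python
NR_CPUS = 4        # cpus in the busiest group
NICE_0_LOAD = 1024 # load weight of a nice-0 task


def pull_tasks(tasks, budget):
    """Greedily move the lightest tasks that still fit in the load budget."""
    remaining, moved = [], []
    for weight in sorted(tasks):
        if weight <= budget:
            moved.append(weight)
            budget -= weight
        else:
            remaining.append(weight)
    return remaining, moved


def idle_cpus(tasks, nr_cpus):
    """Cpus left without a task, assuming one task per cpu where possible."""
    return max(0, nr_cpus - len(tasks))


busiest = [10, 10, 10, 1024]
imbalance = 100  # illustrative budget, see the note above

# Without the extra limit: all three light tasks leave the group.
left_uncapped, moved_uncapped = pull_tasks(busiest, imbalance)

# With the limit: busiest runs 4 tasks on 4 cpus, so nothing is above
# capacity and the budget collapses to zero.
load_above_capacity = max(0, len(busiest) - NR_CPUS) * NICE_0_LOAD
left_capped, moved_capped = pull_tasks(busiest, min(imbalance, load_above_capacity))

print(idle_cpus(left_uncapped, NR_CPUS), idle_cpus(left_capped, NR_CPUS))
```

In this model the uncapped pull leaves only the 1024 task behind, while
the capped pull moves nothing and keeps every cpu in the group busy.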

Now, running all 8 weight-10 tasks on a single cpu and the one 1024 task
on another while keeping 6 cpus idle is the 'fairest' solution, but
fairness is not the only goal. We also try to be work-conserving, iow.
keep as many cpus busy as possible.