Re: [PATCH 1/2] sched: fix and clean up calculate_imbalance

From: Peter Zijlstra
Date: Tue Jul 29 2014 - 11:27:15 EST


On Tue, Jul 29, 2014 at 04:59:10PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 29, 2014 at 11:04:50AM +0200, Vincent Guittot wrote:
> > > In situations where all the domains are overloaded, or where only the
> > > busiest domain is overloaded, that code is also superfluous, since
> > > the normal env->imbalance calculation will figure out how much to move.
> > > Remove the load_above_capacity calculation.
> >
> > IMHO, we should not remove that part, which is used by prefer_sibling.
> >
> > Originally, we had 2 types of busiest group: overloaded or imbalanced.
> > You add a new one which only has an avg_load higher than the others, so
> > you should handle this new case and keep the other ones unchanged.
>
> Right, so we want that code for overloaded -> overloaded migrations, so
> as not to cause idle cpus in an attempt to balance things. Idle cpus are
> worse than imbalance.
>
> But in case of overloaded/imb -> !overloaded migrations we can allow it,
> and in fact want to allow it in order to balance idle cpus.

Which would be patch 3/2.

---
Subject: sched,fair: Allow calculate_imbalance() to move idle cpus
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Tue Jul 29 17:15:11 CEST 2014

Allow calculate_imbalance() to 'create' idle cpus in the busiest group
if there are idle cpus in the local group.

Suggested-by: Rik van Riel <riel@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/n/tip-7k95k4i2tjv78iivstggiude@xxxxxxxxxxxxxx
---
kernel/sched/fair.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6273,12 +6273,11 @@ static inline void calculate_imbalance(s
 		return fix_small_imbalance(env, sds);
 	}

-	if (busiest->group_type == group_overloaded) {
-		/*
-		 * Don't want to pull so many tasks that a group would go idle.
-		 * Except of course for the group_imb case, since then we might
-		 * have to drop below capacity to reach cpu-load equilibrium.
-		 */
+	/*
+	 * If there aren't any idle cpus, avoid creating some.
+	 */
+	if (busiest->group_type == group_overloaded &&
+	    local->group_type == group_overloaded) {
 		load_above_capacity =
 			(busiest->sum_nr_running - busiest->group_capacity_factor);
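
For readers following along outside the kernel tree, the changed condition can be sketched as a standalone predicate. This is only an illustration under stated assumptions: struct group_stats, cap_applies() and tasks_above_capacity() are hypothetical names, the struct is a cut-down stand-in for the kernel's per-group statistics, and only the first step of the load_above_capacity computation visible in the hunk is modelled.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel's group statistics. */
enum group_type {
	group_other = 0,
	group_imbalanced,
	group_overloaded,
};

struct group_stats {
	enum group_type group_type;
	unsigned int sum_nr_running;		/* runnable tasks in the group */
	unsigned int group_capacity_factor;	/* tasks the group can take before it is overloaded */
};

/*
 * The capacity cap only applies to overloaded -> overloaded migrations.
 * If the local group is not overloaded, the pull is not capped, so the
 * busiest group may drop below capacity to fill idle cpus locally.
 */
static bool cap_applies(const struct group_stats *busiest,
			const struct group_stats *local)
{
	return busiest->group_type == group_overloaded &&
	       local->group_type == group_overloaded;
}

/*
 * Number of tasks the busiest group runs beyond its capacity factor.
 * The guard against underflow is an assumption added for this sketch;
 * an overloaded group is expected to have more tasks than capacity.
 */
static unsigned int tasks_above_capacity(const struct group_stats *busiest)
{
	if (busiest->sum_nr_running <= busiest->group_capacity_factor)
		return 0;
	return busiest->sum_nr_running - busiest->group_capacity_factor;
}
```

With this split, a caller would consult tasks_above_capacity() only when cap_applies() is true. Before the patch, the equivalent check looked at the busiest group alone.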
