Re: [PATCH 1/2] sched: fix and clean up calculate_imbalance

From: Peter Zijlstra
Date: Tue Jul 29 2014 - 10:50:04 EST


On Mon, Jul 28, 2014 at 02:16:27PM -0400, riel@xxxxxxxxxx wrote:
> @@ -6221,16 +6221,16 @@ void fix_small_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
> */
> static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
> {
> - unsigned long max_pull, load_above_capacity = ~0UL;
> struct sg_lb_stats *local, *busiest;
>
> local = &sds->local_stat;
> busiest = &sds->busiest_stat;
>
> - if (busiest->group_imb) {
> + if (busiest->avg_load <= sds->avg_load) {
> /*
> - * In the group_imb case we cannot rely on group-wide averages
> - * to ensure cpu-load equilibrium, look at wider averages. XXX
> + * Busiest got picked because it is overloaded or imbalanced,
> + * but does not have an above-average load. Look at wider
> + * averages.
> */
> busiest->load_per_task =
> min(busiest->load_per_task, sds->avg_load);

I don't think that's right; this code is really for the imbalance case
only, although I'm now wondering (again) why.

Currently the only other case is overloaded (since, as you noticed, we
don't balance for !overloaded), and that case explicitly doesn't use
this clamp. So making the overloaded case use it doesn't make sense.
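
For reference, the existing imbalance-only behaviour can be sketched
standalone as below. This is an illustration under stated assumptions, not
kernel code: struct grp_stats, min_ul() and clamp_load_per_task() are
hypothetical stand-ins for struct sg_lb_stats, the kernel's min() and the
open-coded assignment in calculate_imbalance(), and domain_avg_load stands
in for sds->avg_load.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for struct sg_lb_stats. */
struct grp_stats {
	unsigned long load_per_task;
	int group_imb;		/* non-zero: group was picked as imbalanced */
};

/* Stand-in for the kernel's min() macro, unsigned long only. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Clamp the per-task load to the domain-wide average, but only for a
 * group flagged as imbalanced. An overloaded (non-imbalanced) group
 * keeps its own load_per_task untouched.
 */
static void clamp_load_per_task(struct grp_stats *busiest,
				unsigned long domain_avg_load)
{
	assert(busiest);

	if (busiest->group_imb)
		busiest->load_per_task =
			min_ul(busiest->load_per_task, domain_avg_load);
}
```

With load_per_task = 900 and a domain average of 600, an imbalanced group
ends up at 600 while an overloaded one stays at 900.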

> @@ -6247,32 +6247,15 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
> return fix_small_imbalance(env, sds);
> }
>
> - if (!busiest->group_imb) {
> - /*
> - * Don't want to pull so many tasks that a group would go idle.
> - * Except of course for the group_imb case, since then we might
> - * have to drop below capacity to reach cpu-load equilibrium.
> - */
> - load_above_capacity =
> - (busiest->sum_nr_running - busiest->group_capacity_factor);
> -
> - load_above_capacity *= (SCHED_LOAD_SCALE * SCHED_CAPACITY_SCALE);
> - load_above_capacity /= busiest->group_capacity;
> - }

I think we want to retain that, especially for the overloaded case. So
the condition wants to be:

if (busiest->sum_nr_running > busiest->group_capacity_factor)

Clearly it doesn't make sense for the !overloaded case, and we explicitly
want to avoid it in the group_imb case.
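
A minimal standalone sketch of the guarded computation follows. It is not
kernel code: struct cap_stats and load_above_capacity() are hypothetical
names, both scale constants are assumed to be 1024, and only the overloaded
test is modelled (how the group_imb case is kept out is not shown). The
guard also matters arithmetically: the fields are unsigned, so without it
the subtraction would wrap when sum_nr_running is below
group_capacity_factor.

```c
#include <assert.h>
#include <limits.h>

/* Assumed values, for illustration only. */
#define SCHED_LOAD_SCALE	1024UL
#define SCHED_CAPACITY_SCALE	1024UL

/* Hypothetical, trimmed-down stand-in for struct sg_lb_stats. */
struct cap_stats {
	unsigned long sum_nr_running;
	unsigned long group_capacity_factor;
	unsigned long group_capacity;
};

/*
 * Return the load the busiest group carries above its capacity, or
 * ULONG_MAX (the same value as ~0UL) when the group is not overloaded,
 * meaning "no cap" for a later min() against this value.
 */
static unsigned long load_above_capacity(const struct cap_stats *busiest)
{
	unsigned long load = ULONG_MAX;

	assert(busiest->group_capacity > 0);

	if (busiest->sum_nr_running > busiest->group_capacity_factor) {
		load = busiest->sum_nr_running - busiest->group_capacity_factor;
		load *= SCHED_LOAD_SCALE * SCHED_CAPACITY_SCALE;
		load /= busiest->group_capacity;
	}

	return load;
}
```

For example, 6 runnable tasks on a group with capacity factor 4 and
group_capacity 4096 gives (2 * 1024 * 1024) / 4096 = 512, while 3 runnable
tasks on the same group leaves the value at ULONG_MAX.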