Re: [PATCH] Load balancing problem in 2.6.2-mm1

From: Rick Lindsley
Date: Fri Feb 06 2004 - 18:15:56 EST


OK, but do you agree that the rate we rebalance things like 2 vs 1 should
be slower than the rate we rebalance 3 vs 1 ? Fairness is only relevant
over a long term imbalance anyway, so there should be a big damper on
"fairness only" rebalances.

I think, given the precision we're granted via SCHED_LOAD_SCALE, in
combination with the new "load average" (cpu_load) code, that we can
achieve what we want.

If cpu0 has 2 runnable tasks and cpu1 has 1 runnable task, won't we see
the "load average" of cpu0 slowly approach 2, but not jump there?

Right now, we round up on all fractions and Martin has proposed a patch
which takes it the other way and rounds down. What if in marginal
cases like this where this is a small but persistent difference, we
could bump the task to another cpu when it reaches (say) 1.8 or 1.9?
That would keep it there longer for shorter-lived tasks, but for those
long-runners, they'd eventually spread the pain around a little.

And yes, a cpu_load of even 1.0 should *never* get migrated to a cpu
with a load 0.0. Instead of

*imbalance = (*imbalance + SCHED_LOAD_SCALE - 1) >> SCHED_LOAD_SHIFT;

how about, for instance,

if (max_load <= SCHED_LOAD_SCALE)
*imbalance = 0;
else
*imbalance = (*imbalance + (SCHED_LOAD_SCALE / 6) - 1)
>> SCHED_LOAD_SHIFT;

The intent is to never move anything if max_load is 1 or less (what
advantage is there?) and to create a slight tendency to round up at
loads greater than that, which would still tend to leave things where
they were until they'd been there a while. In fact the "bonus"
(SCHED_LOAD_SCALE / 6 - 1) could be another configurable in the scheduling
domain so that at some level you're not interested in fairness and
they just don't bounce at all.

Rick
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/