Re: [PATCH] Load balancing problem in 2.6.2-mm1

From: Martin J. Bligh
Date: Fri Feb 06 2004 - 13:35:06 EST


>> This patch allows for this_load to set max_load, which if I understand
>> the logic properly is correct. It then adds a check to imbalance to make
>> sure a negative number hasn't been coerced into a large positive number.
>> With this patch applied, the algorithm is *much* more conservative ...
>> maybe *too* conservative but that's for another round of testing ...
>
> Good stuff, I just gave the patch a spin and things seem a little
> calmer. However Im still seeing a lot of balancing going on within a
> node.
>
> Setup:
>
> 2 threads per cpu.
> 2 nodes of 16 threads each.
>
> I ran a single "yes > /dev/null"
>
> And it looks like that process is bouncing around the entire node.
> Below is a 2 second average.

OK, what happens if you apply this ... does it fix it?
(not saying this is the correct solution, just debug)

M.

diff -purN -X /home/mbligh/.diff.exclude mm1/kernel/sched.c mm1-schedfix/kernel/sched.c
--- mm1/kernel/sched.c 2004-02-06 10:28:55.000000000 -0800
+++ mm1-schedfix/kernel/sched.c 2004-02-06 10:32:04.000000000 -0800
@@ -1440,15 +1440,11 @@ nextgroup:
*/
*imbalance = min(max_load - avg_load, avg_load - this_load);

- /* Get rid of the scaling factor now, rounding *up* as we divide */
- *imbalance = (*imbalance + SCHED_LOAD_SCALE - 1) >> SCHED_LOAD_SHIFT;
+ /* Get rid of the scaling factor now */
+ *imbalance = *imbalance >> SCHED_LOAD_SHIFT;

if (*imbalance == 0) {
- if (package_idle != NOT_IDLE && domain->flags & SD_FLAG_IDLE
- && max_load * busiest_nr_cpus > (3*SCHED_LOAD_SCALE/2))
- *imbalance = 1;
- else
- busiest = NULL;
+ busiest = NULL;
}

return busiest;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/