Re: [Lse-tech] [patch] sched-domain cleanups, sched-2.6.5-rc2-mm2-A3

From: Andi Kleen
Date: Thu Mar 25 2004 - 10:41:59 EST


On Thu, Mar 25, 2004 at 07:31:37AM -0800, Nakajima, Jun wrote:
> Andi,
>
> Can you be more specific with "it doesn't load balance threads
> aggressively enough"? Or what behavior of the base NUMA scheduler is
> missing in the sched-domain scheduler especially for NUMA?

It doesn't do load balance in wake_up_forked_process() and is relatively
non aggressive in balancing later. This leads to the multithreaded OpenMP
STREAM running its childs first on the same node as the original process
and allocating memory there. Then later they run on a different node when
the balancing finally happens, but generate cross traffic to the old node,
instead of using the memory bandwidth of their local nodes.

The difference is very visible, even the 4 thread STREAM only sees the
bandwidth of a single node. With a more aggressive scheduler you get
4 times as much.

Admittedly it's a bit of a stupid benchmark, but seems to representative
for a lot of HPC codes.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/