Re: [RFC] generalise scheduling classes

From: Nick Piggin
Date: Sun Nov 23 2003 - 21:28:10 EST




Anton Blanchard wrote:

We still don't have an HT aware scheduler, which is unfortunate because
weird stuff like that looks like it will only become more common in future.


Yep. Look at POWER5, 2 cores on a die sharing a l2 cache and 2 threads
on each core. On top of that you have the higher level NUMA
characteristics of the machine. So we need SMT as well as (potentially)
2 levels of NUMA. The overhead of enabling multi levels of NUMA may
outweigh the gains, we need to do some analysis.


Technically the scheduler knows nothing about NUMA. Previously it had
local and a remote domains corresponding to inter and intra node cpu sets.
All it did was to do remote balancing a little more gently. But we'll call
it NUMA scheduling.

What you want for POWER5 is very aggressive sharing at the SMT level and
possibly even the chip level if they share l2. Less aggressive for node
local and then even less for remote.

SGI I think have differing distances between NUMA nodes and they expressed
possible interest in a multi level system.

I can't give you good benchmark numbers because I only have the NUMAQ at
OSDL to test on - its only got 2 levels anyway. I should think that
overheads are quite minor considering it is in slow paths (balancing).


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/