Re: 2.6.4-mm1

From: Nick Piggin
Date: Thu Mar 11 2004 - 23:26:34 EST




Andi Kleen wrote:

On Thu, Mar 11, 2004 at 07:04:50PM -0800, Nakajima, Jun wrote:

As we can have more complex architectures in the future, the scheduler
is flexible enough to represent various scheduling domains effectively,
and yet keeps the common scheduler code simple.


I think for SMT alone it's too complex and for NUMA it doesn't do
the right thing for "modern NUMAs" (where NUMA factor is very low
and you have a small number of CPUs for each node).



For SMT it is a less complex than shared runqueues, it is actually
less lines of code and smaller object size.

It is also more flexible than shared runqueues in that you can still
have control over each sibling's runqueue. Con's SMT nice patch for
example would probably be more difficult to do with shared runqueues.
Shared runqueues also gives zero affinity to siblings. While current
implementations may not (do they?) care, future ones might.

For Opteron type NUMA, it actually balances much more aggressively
than the default NUMA scheduler, especially when a CPU is idle. I
don't doubt you aren't seeing great performance, but it should be
able to be fixed.

The problem is just presumably your lack of time to investigate
further, and my lack of problem descriptions or Opterons.

One thing you definitely want is a sched_balance_fork, is that right?
Have you been able to do any benchmarks on recent -mm kernels?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/