Re: [RFC] Simple NUMA scheduler patch

From: Erich Focht (efocht@ess.nec.de)
Date: Mon Oct 14 2002 - 12:19:03 EST


Hi Michael,

On Tuesday 08 October 2002 01:37, Michael Hohnbaum wrote:
> > I'll post some numbers comparing O(1), pooling scheduler, node affine
> > scheduler and RSS based affinity in a separate email. That should help
> > to decide on the direction we should move.
>
> One other piece to factor in is the workload characteristics - I'm
> guessing that Azusa is being used more for scientific workloads which
> tend to be a bit more static and consume large memory bandwidth.

Yes, I agree. We're trying to keep HPC tasks on their nodes because we
KNOW that they are memory bandwidth and latency hungry. Therefore I
believe HPC-like jobs are good benchmarks, and they are easier to set up
than database tests (which can also demand very high bandwidths).

> > machine. For me that means the maximum memory bandwidth available for
> > each task, which you only get if you distribute the tasks equally among
> > the nodes.
>
> Depends on the type of job. Some actually benefit from being on the
> same node as other tasks as locality is more important than bandwidth.
> I am seeing some of this - when I get better distribution of load across
> nodes, performance goes down for sdet and kernbench.

This is a difficult issue, because we're trying to get higher
performance out of a multitude of benchmarks (ever tried AIM7? It never
execs... so goodbye initial balancing). But we're also trying to find a
solution which is good for any kind of NUMA machine. Currently we
experiment on two very different architectures:
- Azusa : remote/local latency ratio 1.6, but no node level cache
- NUMAQ : remote/local latency ratio 20, additional node level cache.

My current approach to adapting the tuning parameters to the
machine is via the steal delays. Yours is via the load imbalance
needed to trigger a steal from a remote node. Maybe we'll even need more
knobs to make these fit any NUMA machine...
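
To make the two kinds of knobs concrete, here is a rough sketch of how a
steal delay and a load imbalance threshold could be combined in a single
remote-steal check. This is not code from either patch: the struct, the
field names, the function name and the percent-based threshold are all
hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-machine tuning knobs (names are assumptions, not
 * taken from any of the scheduler patches under discussion). */
struct numa_tuning {
    unsigned int steal_delay_ticks; /* wait this long before a remote steal */
    unsigned int imbalance_pct;     /* remote load must exceed local by this % */
};

/* Return true if a CPU may steal a task from a remote node.
 * Both conditions must hold: the delay has elapsed AND the remote
 * node is loaded more heavily than the local one by the threshold. */
static bool may_steal_remote(const struct numa_tuning *t,
                             unsigned int local_load,
                             unsigned int remote_load,
                             unsigned int ticks_waited)
{
    assert(t != NULL);

    /* Knob 1: steal delay. Too early, stay on the local node. */
    if (ticks_waited < t->steal_delay_ticks)
        return false;

    /* Knob 2: load imbalance, compared in integer arithmetic:
     * remote_load > local_load * (100 + pct) / 100 */
    return (unsigned long long)remote_load * 100 >
           (unsigned long long)local_load * (100 + t->imbalance_pct);
}
```

With something like this, a machine with an expensive remote access
would presumably get a larger delay and/or a larger imbalance percentage
than one where remote access is cheap. The actual values would have to
come from measurements.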

Regards,
Erich



This archive was generated by hypermail 2b29 : Tue Oct 15 2002 - 22:00:50 EST