Re: SMP scalability: 8 -> 32 CPUs

Jason Riedy (
Thu, 03 Dec 1998 19:22:30 -0500

Oh well. And Jes Sorensen writes:
- I am not sure I understand you right here, on the O2000 you can either
- run the whole box as a single image or I _think_ partition it and run
- multiple instances of IRIX on it (I think IRIX 6.5 allows this at
- least, maybe I'm wrong).

ASCI Blue Mountain is a cluster of O2000s with 6144 total processors. I
believe it's set up to appear as a cluster of single machine images (1k
processors), each of which is a NUMA, but I don't have access to it to
check. I don't know if the cluster follows the same design as the
individual machines (fuzzy hypercubes).

BTW, ASCI Blue Pacific, the 1344-processor IBM SP/2 version, is a cluster
of 8-way SMPs (iirc). It has a pretty horrible memory latency problem
inside a single node (it's as fast as talking to another node), but the
_idea_ is to have a lot of really fast local memory, plus a trusted, fast
connection to other SMPs.

Origins are technically clusters of dual-processor SMPs, where the
connecting links are funky page maps. Writing code for a NUMA used to
seem silly (you really can't treat them as SMPs, so you write as if it's
a distributed box, at least in my area), but recent results with TLBs
even in uniprocessor systems make the needed optimizations seem like good
ideas in general.

Unfortunately this has gone way far afield of the Linux kernel.

Some perspective from me (just a grad student in parallel linear algebra)
and what I've gleaned from others. I'll be quiet on non-kernel stuff
after this.

Linux's primary supported architecture (x86) doesn't have enough memory
bandwidth to support high-performance clusters of SMPs, or at least not in
many areas. The main Alphas being used with Linux are single-processors
and try to suck memory through a stirring straw. The UltraSparcs are
definitely good for this, but I'd most likely be using Solaris on any
large installation, and the multi-processor boxes are _not_ inexpensive.

If you want to build a HPC cluster with Linux and x86 / inexpensive Alphas,
the way to go right now is lots of UP nodes. In the next year or so, SMP
x86-architecture machines should gain usable memory bandwidth while keeping
costs down. I haven't seen any indication of serious > 8 processor work.
AMD & classic Digital probably won't shoot beyond 8, either. Sun's fastest
processors currently (or at least the last I heard) can't run in the boxes
that hold more than 14 processors, and their clock speed is slower, so
that gives you an idea of how fast these problems are coming up.

Hence, the current target is just right for the majority of inexpensive
clusters for probably the next two years.

While it'd be nice to have Linux run efficiently on really big SMPs, it
doesn't seem practical with the tools in use and the SMP limitations coming
up due to increasing clock speeds. There are better places to spend time
than going beyond eight processors, at least until people solve more hardware
interconnect problems.

Jason, sounding like I know more than I do... ;)
