Re: SMP Theory (was: Re: Interesting analysis of linux kernel threading by IBM)

From: David Lang (dlang@diginsite.com)
Date: Mon Jan 24 2000 - 21:33:31 EST


Congratulations, you just described NUMA machines :-)

In an SMP box all CPUs have equal access to all memory. This does cause
the exact problems that you describe. One solution to this is to give each
CPU its own memory; the problem then is that some parts of memory are
easier to get at than others, depending on which CPU you are accessing
them from. Thus Non-Uniform Memory Access (NUMA) machines.
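
A toy sketch of what "non-uniform" means here, assuming one block of
memory per CPU as described above. The two cost values are invented
purely for illustration and do not come from any real machine; the
function names are hypothetical as well.

```python
# Toy model of non-uniform memory access.
# ASSUMPTION: the costs below are made-up relative numbers, not measurements.
LOCAL_COST = 1   # a CPU touching the memory attached to itself
REMOTE_COST = 4  # a CPU touching memory attached to some other CPU


def access_cost(cpu, memory_owner):
    """Relative cost for `cpu` to access memory attached to `memory_owner`."""
    return LOCAL_COST if cpu == memory_owner else REMOTE_COST


def cost_matrix(num_cpus):
    """Row = accessing CPU, column = CPU that owns the memory."""
    return [[access_cost(cpu, owner) for owner in range(num_cpus)]
            for cpu in range(num_cpus)]


if __name__ == "__main__":
    # On an SMP box every entry would be the same; here only the
    # diagonal (local memory) is cheap.
    for row in cost_matrix(4):
        print(row)
```

In this model the cheap entries sit on the diagonal, which is the sense
in which some memory is "easier to get at" from a given CPU.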

The headache then becomes memory allocation. You need to make sure that
when you allocate memory to a process you give it memory that is attached
to the CPU the process runs on; otherwise the process has much slower
access to its data.
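
The allocation rule above can be sketched as a "local first" policy.
This is only a minimal illustration under stated assumptions: one
memory node per CPU, pages as the allocation unit, and a fallback to
whichever node has the most free pages. The class name and the fallback
choice are hypothetical and are not taken from any real kernel
allocator.

```python
class LocalFirstAllocator:
    """Toy page allocator: prefer the memory attached to the requesting CPU.

    ASSUMPTION: node i is the memory local to CPU i (one node per CPU).
    """

    def __init__(self, pages_per_node):
        # free[i] = number of free pages on the node local to CPU i
        self.free = list(pages_per_node)

    def alloc(self, cpu):
        """Allocate one page for a process on `cpu`; return the node used."""
        if self.free[cpu] > 0:
            # Fast case: local memory is available.
            self.free[cpu] -= 1
            return cpu
        # Local node is exhausted: fall back to the node with the most
        # free pages (hypothetical policy; access will be slower).
        node = max(range(len(self.free)), key=lambda n: self.free[n])
        if self.free[node] == 0:
            raise MemoryError("no free pages on any node")
        self.free[node] -= 1
        return node


if __name__ == "__main__":
    allocator = LocalFirstAllocator([1, 2])
    print(allocator.alloc(0))  # local page for CPU 0
    print(allocator.alloc(0))  # CPU 0's node is empty, so a remote page
```

A process that lands in the fallback branch still runs, but every
access to that page pays the remote penalty, which is why the allocator
tries the local node first.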

If you can deal with that difficulty you can build machines that have
hundreds of CPUs more easily than you can hit tens of CPUs on an SMP
machine.

David Lang

On Mon, 24 Jan 2000 dg50@daimlerchrysler.com wrote:

> I've been reading the SMP thread and this is a truly educational and
> fascinating discussion. How SMP works, and how much of a benefit it
> provides has always been a bit of a mystery to me - and I think the light
> is slowly coming on.
>
> But I have a couple of (perhaps dumb) questions.
>
> OK, if you have an n-way SMP box, then you have n processors with n (local)
> caches sharing a single block of main system memory. If you then run a
> threaded program (like a renderer) with a thread per processor, you wind up
> with n threads all looking at a single block of shared memory - right?
>
> OK, if a thread accesses (I assume writes, reading isn't destructive, is
> it?) a memory location that another processor is "interested" in, then
> you've invalidated that processor's local cache - so it has to be flushed
> and refreshed. Have enough cross-talk between threads, and you can achieve
> the worst-case scenario where every memory access flushes the cache of
> every processor, totally defeating the purpose of the cache, and perhaps
> even adding nontrivial cache-flushing overhead.
>
> If this is indeed the case (please correct any misconceptions I have) then
> it strikes me that perhaps the hardware design of SMP is broken. That
> instead of sharing main memory, each processor should have its own main
> memory. You connect the various main memory chunks to the "primary" CPU via
> some sort of very wide, very fast memory bus, and then when you spawn a
> thread, you instead do something more like a fork - copy the relevant
> process and data to the child cpu's private main memory (perhaps via some
> sort of blitter) over this bus, and then let that CPU go play in its own
> sandbox for a while.
>
> Which really is more like the "array of uni-processor boxen joined by a
> network" model than it is current SMP - just with a REALLY fast&wide
> network pipe that just happens to be in the same physical box.
>
> Comments? Please feel free to reply private-only if this is just too
> entry-level for general discussion.
>
> DG
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
>




This archive was generated by hypermail 2b29 : Mon Jan 31 2000 - 21:00:13 EST