Re: Help in DSM design

From: Albert D. Cahalan (acahalan@cs.uml.edu)
Date: Sat Mar 04 2000 - 01:03:54 EST


Richard Gooch writes:
> Albert D. Cahalan writes:
>> Richard Gooch writes:

>>> Ach! Not another DSM project :-( Don't do it. There's already a DSM
>>> implementation for Linux,
>>
>> Obviously this is a feature people want.
>
> Too many people think they want it because they don't really
> understand the alternatives and the implications.

No way. Their situation is just different from yours.

Some people have existing software. They want to port it with
minimal effort, since CPU time is cheaper than BRAIN time.

Some people want to prototype on normal clusters, then run the
code on fancy hardware. If message passing is slower on the
fancy hardware, then prototyping with messages is stupid.

>>> and besides, you're better off with a message-passing interface.
>>> That way application coders can see how costly operations are.
>>> Using DSM hides that, resulting in inefficient code.
>>
>> Message passing can be more costly! On the hardware I develop for, a
>> "message" involves setting up some DMA control data. Distributed
>> shared memory has a one-time setup cost, so it is faster for
>> frequent access to small bits of data.
>
> Message passing must be more efficient than DSM. We are talking about
> clusters, not a single SMP machine. The reason is that DSM *must* sit
> on top of a transport layer (aka message passing).

The world is not only clusters and SMP.

Even if it was, you could prototype on a cheap cluster before you
buy the 8-way SMP Alpha that you need for regular use.

>> You could really mess up performance by using a message-passing API
>> for repeated random access to 8-byte values. Actually, I think the
>> break-even point is near 2 kB.
>
> If you mean within a single, (hardware) shared memory computer, then
> yes. Otherwise, no. And DSM is all about pretending you have shared
> memory across a network of computers.

It all depends on how you define "shared" and "network" I suppose,
but I'm certainly not using SMP hardware or TCP/IP. I can define
physical address spaces that, when accessed, cause automatic data
movement with hardware routing at hundreds of megabytes/second.
(note BYTES not BITS) I think DSM would be perfect here.

This kind of hardware doesn't come cheap. You could save a huge
amount of money by doing development on a cluster, but you'd need
DSM to be able to run your code.

Going the other way, code written for the fancy hardware could
be grabbed for less demanding needs on a cluster. Maybe you will
need switches instead of dumb hubs... oh well, that is likely less
costly than a software rewrite.

Try to remember that not everyone has your hardware and needs.
I've looked at the Linux DSM code, so I know you won't be hurt
by it being in the kernel.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Mar 07 2000 - 21:00:16 EST