Re: Help in DSM design

From: Albert D. Cahalan (acahalan@cs.uml.edu)
Date: Sat Mar 04 2000 - 23:03:44 EST


Richard Gooch writes:
> Albert D. Cahalan writes:
>> Richard Gooch writes:
>>> Albert D. Cahalan writes:
>>>> Richard Gooch writes:

> No. My point is that while *some* people will be aware that their
> threaded programme run on DSM will have to be re-coded to avoid lots
> of lock movement, *most* authors of threaded programmes will be
> unaware of this. DSM discourages people from thinking.

The lucky will get by without a rewrite. Why deny them a chance?
The lack of choice you advocate doesn't allow very much thought.

>> Some people have existing software. They want to port it with
>> minimal effort, since CPU time is cheaper than BRAIN time.
>
> Why buy a 200 node cluster when you can get the same performance
> with a 20 CPU SMP box?

Ability to use the hardware for normal stuff, parts availability,
and of course expense. Where do you get commodity 20-way SMP????

The cheapest 20-way SMP system I could find was a Sun box with
400-MHz SPARC processors and 20 GB RAM. It was $564000 w/o drives.
These overgrown systems suffer from mysterious problems BTW.
(I use a Sun single-point-of-failure often enough to know.)

Penguin Computing: Two 550 MHz Pentium III, 256 MB, 13.6 GB for $2115
Penguin Computing: AMD Athlon 650 MHz, 256 MB, 13.6 GB for $1815
No-name: AMD Athlon 600, 128 MB, 8.4 GB, "Cool Linux Keyboard" for $939
Microway: 533 MHz Alpha 21164, 64 MB, 6.5 GB for $1995

Hmmm, 564 k$ for SMP. Roughly 200 to 400 k$ for the cluster.
Not that this isn't an insane comparison... hope you like Solaris.
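
For anyone checking the arithmetic, here is a rough sketch in C. It assumes
200 identical nodes at the list prices quoted above and counts nothing for
the interconnect, racks or drives; the names are made up for illustration.

```c
#include <assert.h>

/* Prices quoted in this message, in US dollars. */
enum {
    SMP_20WAY_SUN   = 564000, /* 20-way Sun box, without drives */
    NODE_DUAL_P3    = 2115,   /* Penguin Computing, two 550 MHz Pentium III */
    NODE_ATHLON_650 = 1815,   /* Penguin Computing, AMD Athlon 650 MHz */
    NODE_NONAME     = 939,    /* no-name AMD Athlon 600 */
    NODE_ALPHA      = 1995    /* Microway, 533 MHz Alpha 21164 */
};

/*
 * Cost of the bare nodes only. Assumption: every node is the same model
 * and network hardware is not included.
 */
long cluster_cost(long nodes, long price_per_node)
{
    assert(nodes >= 0 && price_per_node >= 0);
    return nodes * price_per_node;
}
```

With 200 nodes this gives 187800 for the no-name boxes and 423000 for the
dual Pentium III boxes, which is where "roughly 200 to 400 k$" comes from.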

Oh yeah... should you decide you really do need message passing,
you can do it on the 200-node cluster.

>> Some people want to prototype on normal clusters, then run the
>> code on fancy hardware. If message passing is slower on the
>> fancy hardware, then prototyping with messages is stupid.
>
> Message passing can't be slower. Even with your automatic DMA
> registers (from the private email you sent), the message passing call
> can simply do a memcpy to the DMA memory area.

Now you have the CPU doing extra work, wasting memory and trashing
the cache. You also lost some ease of use.

BTW, some people like to do MPI on this hardware. This is stupid
because it has lower performance, but people get the choice.
People with a clue use explicit DMA for large transfers only.
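
To make the "extra work" concrete, here is a minimal sketch in C. It is not
the real hardware interface, which isn't described here: an ordinary array
stands in for the automatically-moved address range, and msg_send,
fill_in_place and window_peek are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WINDOW_SIZE 4096

/* Stand-in for the address range that causes automatic data movement. */
static unsigned char remote_window[WINDOW_SIZE];

/*
 * Message-passing style: the data already sits in a local buffer, and
 * the send call copies it into the window. That memcpy is a second pass
 * over the data, and the local buffer is memory used twice.
 */
size_t msg_send(const void *buf, size_t len)
{
    assert(len <= WINDOW_SIZE);
    memcpy(remote_window, buf, len);
    return len;
}

/*
 * Shared-memory style: the producer writes its results straight into
 * the window, so there is no staging buffer and no second pass.
 */
void fill_in_place(size_t len, unsigned char seed)
{
    size_t i;

    assert(len <= WINDOW_SIZE);
    for (i = 0; i < len; i++)
        remote_window[i] = (unsigned char)(seed + i);
}

/* Read-only view of the window, for inspection. */
const unsigned char *window_peek(void)
{
    return remote_window;
}
```

Both routes leave the same bytes in the window. The difference is that the
first touches every byte twice and needs the staging buffer.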

>>> If you mean within a single, (hardware) shared memory computer, then
>>> yes. Otherwise, no. And DSM is all about pretending you have shared
>>> memory across a network of computers.
>>
>> It all depends on how you define "shared" and "network" I suppose,
>> but I'm certainly not using SMP hardware or TCP/IP. I can define
>> physical address spaces that, when accessed, cause automatic data
>> movement with hardware routing at hundreds of megabytes/second.
>> (note BYTES not BITS) I think DSM would be perfect here.
>
> Very nice. What's the latency, though?

This all depends on the exact hardware version, and I doubt I'm
allowed to tell you the latest numbers. Generally though, latency
is lower for shared memory. It depends on distance and contention.

> Can you ship a lock in << 100 ns? Is there a penalty for doing
> two back-to-back? If you write, is the data broadcast to all
> CPUs/local memory pools? Is the MESI cache protocol supported?

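Lock in << 100 ns: don't think I can tell you. (This is coherent memory BTW.)
Penalty for two back-to-back: of course not.
Writes broadcast to all CPUs/local memory pools: no.
MESI cache protocol: don't think I can tell you.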

> I'm not saying that DSM cannot ever be a useful thing. I'm sure that
> there will be some applications that will scale fairly well, so you
> could say that the reduction in brain power is worth the small
> performance loss.
>
> What I am saying is that the vast majority of people who jump up and
> down saying that DSM will be great for their threaded application are
> wrong. If they think their performance won't suck, it's more likely
> than not that they haven't thought hard enough about it. The problem
> is not that DSM will kill performance in the rest of the kernel (I'm
> assuming it won't, otherwise it would never get past Linus, and would
> be howled down by me, Larry and plenty of others, I'm sure). The
> problem is that DSM is too seductive and easy to use, *when it's
> almost certain it shouldn't be used*.
>
> DSM should have a huge red health warning label telling people that
> they really don't want it.

Sure, the man page can explain this.
"hack used to port SMP code to clusters"
"not recommended for new code due to performance problems"
