Re: 3.0 wishlist Was: Overview of 2.2.x goals?

Larry McVoy (lm@who.net)
Fri, 23 Jan 1998 10:23:20 -0800

Next message: Bill Hawes: "Re: NFS bug on 2.1.80"
Previous message: Linus Torvalds: "Re: Followup: copy_to_user return value breaks lots of code"
Maybe in reply to: linux kernel account: "3.0 wishlist Was: Overview of 2.2.x goals?"
Next in thread: Gordon Chaffee: "Re: NLS char set and fsck"

: Using message passing on SMP machines with _real_ shared memory is not
: very clever.

Want to bet? Using the /unmodified/ message passing libraries is stupid.
But what good vendors do is to provide the same interfaces, in a DLL, that
are optimized for SMP. So you do things like:

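msg_send(from, to, ...)
{
        if (SMP(from, to)) {
                /* both ends on the same SMP box: just copy the data */
                bcopy(get_addr(from), get_addr(to), length);
        } else {
                /* different boxes: fall back to the real transport */
                real_msg_send(from, to, ...);
        }
}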

Then it goes like blazes on an SMP while maintaining compatibility with
clusters. In fact, you can mix and match a cluster of SMPs just fine.
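
To make the dispatch idea concrete, here is a minimal C sketch. Everything
in it is hypothetical and for illustration only: struct endpoint, same_smp(),
the remote_sends counter, and the stand-in real_msg_send() (which only
simulates delivery) are not taken from any real message passing library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical endpoint: a host id plus a mailbox in that host's memory. */
struct endpoint {
    int  host;               /* which machine the endpoint lives on */
    char mailbox[BUF_SIZE];  /* receive buffer */
};

static int remote_sends;     /* counts calls that took the slow path */

/* Assumption: two endpoints share memory when they are on the same host. */
static int same_smp(const struct endpoint *from, const struct endpoint *to)
{
    return from->host == to->host;
}

/* Stand-in for the cluster transport; a real library would hit the wire. */
static int real_msg_send(const struct endpoint *from, struct endpoint *to,
                         const void *buf, size_t len)
{
    (void)from;
    remote_sends++;
    memcpy(to->mailbox, buf, len);  /* simulated delivery */
    return 0;
}

/* Same interface either way; only the transport differs. */
int msg_send(const struct endpoint *from, struct endpoint *to,
             const void *buf, size_t len)
{
    assert(len <= BUF_SIZE);
    if (same_smp(from, to)) {
        memcpy(to->mailbox, buf, len);  /* fast path: plain memory copy */
        return 0;
    }
    return real_msg_send(from, to, buf, len);
}
```

The caller never sees which path was taken, which is why the same binary
can run on one SMP box, on a cluster, or on a cluster of SMPs.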

: This might be the trend, but programming message passing systems is not
: very easy. The main reason for their prevalence is better performance, which
: is the result of transferring the needed data (and only the needed data) at
: the right time to the right place.

The main reason for their prevalence is

. works everywhere
. same programming model no matter what your environment
. heterogeneous systems
. ease of use

: >DIPC is cute but is likely to be ignored by the big apps.
:
: I don't know if this will be true, but DIPC could result in a new breed of
: software.

Sure it could, but what you were talking about is big apps. I have a fair bit of
experience with those apps and I can tell you that the war is way past over.
MPI won. For good reason.

: Some of its advantages are:
:
: *) Distributed shared memory is much easier to use.

True for trivial apps. False for real apps. Shared memory (distributed
or otherwise) is a difficult programming model for people to grasp.
Quick - how many people realize that coherent shared memory isn't -
all the state you care about is in the registers of the CPU. So when do
those get flushed? Is your model store ordered, partially store ordered?
Etc., etc. Try and learn from the various SMP kernel experiences - it's
really hard and takes quite talented people to get it right. Do you really
want to design a programming model that most people can't use? Seems quite
elitist to me.
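
As a small illustration of the ordering problem, here is a sketch of the
simplest possible case: one writer handing one int to one reader. It assumes
a C11 compiler with <stdatomic.h>, which is my choice for the example and not
something DIPC or System V IPC gives you; publish() and try_consume() are
made-up names.

```c
#include <assert.h>
#include <stdatomic.h>

static int payload;        /* ordinary data, not atomic */
static atomic_int ready;   /* publication flag, zero-initialized (static) */

/* Writer: fill in the data, then publish it with a release store so the
 * payload write cannot be reordered after the flag write. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader: returns 1 and fills *out only once the flag has been seen.
 * The acquire load pairs with the release store above. */
int try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return 0;
    *out = payload;
    return 1;
}
```

Drop the ordering annotations and use a plain int flag, and the compiler or
the CPU is free to let the reader see the flag before the payload. That is
the kind of detail every shared memory programmer has to get right.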

: *) Programs using DIPC can be run in a single computer, even on Linux
: kernels without DIPC support! There is no need to modify and compile
: the sources to achieve this.

Ditto for MPI.

: *) DIPC programs can automatically use SMP hardware and real shared memory.
: Again, no need for the modification and recompilation of the sources.

Ditto for MPI, it's been this way for years.

: *) DIPC can work in heterogeneous environments (currently x86 and M68k).
: This is not very common among distributed shared memory systems.

So if I put an int in shared memory on a PC and read it on a SPARC it gets
byte swapped? I don't think so. You have to have an RPC like XDR layer to
do this and the kernel sure as hell is not going to do this for you. MPI
has this, of course.
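
For the curious, the conversion such a layer has to do looks roughly like
the sketch below. It writes a 32-bit value in big-endian order, which is the
convention XDR uses for integers; put_u32_be() and get_u32_be() are
hypothetical helper names, not part of XDR or MPI.

```c
#include <assert.h>
#include <stdint.h>

/* Store v in buf in big-endian order, whatever the host byte order is. */
void put_u32_be(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Rebuild the value from big-endian bytes, again independent of the host. */
uint32_t get_u32_be(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

Both sides have to call something like this on every typed field. Raw shared
memory has no idea where the ints are, so it can't do it for you.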

: *) You don't have to learn some new programming interfaces.

Yes you do - you have to learn shared memory. Which is much, much harder for
people to grasp than you might think.

: *) Older programs using System V IPC can still be used.
:
: DIPC is a very safe system to use by the programmers. They don't have to
: learn a lot of new things, and their programming investment does not depend
: on DIPC's success or availability. This is a very big plus.

Look, don't get me wrong - I think DIPC is cute. But you shouldn't
expect the big apps people to get at all interested. It only works on
Linux, is not supported by SGI/SUN/HP/DEC/etc, and as such is completely
uninteresting to anyone whose /job/ it is to do the sort of programming
that you are discussing. People running those big apps are running on
supported machines. Linux isn't commercially supported.

That's not to say that this isn't a fun project. But it does throw into
question the real need for this stuff to be in the mainstream kernel. If
it isn't going to be used very much then it just becomes more lines of
code that can cause the kernel to have bugs. We don't want that, do we?
