Re: Remote fork() and Parallel Programming

Larry McVoy (lm@bitmover.com)
Thu, 11 Jun 1998 18:02:57 -0700


: >Getting back to your comments. The reason I think they aren't that technical
: >is that you are arguing for a bunch of features which /sound/ great.
:
: They sound great because they are great. I wonder how experience could be of
: use to you here.

"People who do not learn from history are doomed to repeat it".

: >Again, please be more specific. What you are doing is called "armchair
: >architecture". You /think/ that these applications exist and would use
: >this facility. I used to think just like you until I went out and waded
: >through the real applications that can and do make use of clusters. They
: >are nothing like what you are imagining.
:
: As I see things, distributed programming should not be (very) different from
: programming a single computer. [etc].

Again, please be more specific. I've asked over and over for specific
examples of how your system would be used and you always reply with
"it should work this way". I'm not interested in how it should work
in your mind. I'm interested in how your model works for real applications.
You are assuming that there are such applications. Yes, there are. But
no real applications fit your model even slightly. Go look at the VAX Cluster
documentation, at Parallel Oracle, Sabre, MPI, PVM, Locus, the Pratt & Whitney
simulation cluster, all the national labs, the financial trading systems,
to mention a few of the more important ones.

Not one of 'em gives a damn about remote fork. Not one. If you were to
stop and think about it for a while, you might realize what a horrible
failure model remote fork() would carry with it. What are you going to
do when your cluster full of rforked() processes loses a node? And that
node happens to have the one page of data that they are all sharing.
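
To make that concrete, here is a single-machine analogue of the model:
fork()ed processes sharing a page through mmap(MAP_SHARED). This is a
simplified sketch, not anyone's real cluster API. A remote fork() spreads
the same picture across machines, except the shared page then lives on
one node, and touching it is a memory reference with no error path:

    /* Sketch: processes sharing one page via mmap(MAP_SHARED).
     * On a single machine this is safe.  Spread it across a
     * cluster and *flag lives on one node; if that node dies,
     * every other process blocks inside a remote page fault,
     * with no error return to recover from. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/wait.h>

    int main(void)
    {
        volatile long *flag = mmap(0, sizeof(long),
                                   PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        int i;

        for (i = 0; i < 4; i++)
            if (fork() == 0) {
                while (*flag == 0)   /* spin until the parent's store shows up */
                    ;
                _exit(0);
            }

        *flag = 1;                   /* all four children see this store */
        while (wait(NULL) > 0)
            ;
        printf("all children finished\n");
        return 0;
    }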

For clusters, shared memory sucks, fork() sucks. Clusters need strong
boundaries between the processes, and strong failure models, since the
probability of a node failing goes to 1 as the number of nodes goes up.
It's absolutely unacceptable to have your application crash because one
node went away. How does remote fork() work on that system? How does
it make sense?
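
The arithmetic is brutal. If each node independently survives a run
with probability (1 - p), then all N nodes survive with probability
(1 - p)^N. The 1% per-node figure below is invented for illustration;
plug in whatever number you like and watch it collapse:

    /* (1-p)^N: the chance that an N-node job sees no node failure.
     * The 1% per-node figure is made up for illustration. */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double p = 0.01;
        int n;

        for (n = 1; n <= 1024; n *= 4)
            printf("%5d nodes: job survives with probability %.5f\n",
                   n, pow(1.0 - p, n));
        return 0;
    }

At 64 nodes you finish barely half the time; at 1024 nodes you
essentially never do.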

: I hope this
: explains my insistence on there being many applications that could benefit.

Look, I can insist that Cindy Crawford is going to walk into my house and
make my life one everlasting period of bliss. I can keep insisting it.
But if I expect other people to believe it, then I have to produce proof.
I can't. Much to my chagrin, Cindy has never heard of me.

Your only answer when questioned is to insist, over and over, that it
will be a good thing; you never answer any specific questions. How do
you expect to
lead people down a path if you don't have good answers for even the obvious
questions?

: BTW, why do you consider some things that usually come in advanced
: textbooks on operating systems as imaginary? If people before you and me were more
: willing to incorporate the more modern concepts in their designs, then maybe
: things would be more real.

"In theory, theory and practice are the same.
In practice, they are different."

--me circa 1993 or so

Textbooks are great for theory. I'm much more interested in practice.

: >Wrong answer. Complexity in the base of the system is a bad idea. There are
: >good reasons that Unix has a "simple" design.
:
: It is very hard to define simplicity and complexity. For example, I have a
: hard time considering a monolithic OS design as simple. Most UNIX variants
: are just a jungle of code (they could have been simple once). They are
: complicated by design.

No, they aren't. An operating system is made up of a collection of objects,
i.e., devices, files, file systems, processes, sockets. You can think of
these objects as C++ classes with all virtual methods; otherwise known as
a set of interfaces where each instance of the class implements all the
methods of that class. All operating systems work this way. The generic
portion of the code that lives above the objects is very small and simple;
go look at Linux. The generic process and memory management code is
14K lines. Tiny. All the code is in the instances of the various objects,
mostly in drivers, with networking and file systems making up the rest.

But the generic part is very, very small. And as simple as it can be.
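
In C, that pattern is a struct full of function pointers. The sketch
below is simplified; it is not the kernel's real struct file_operations,
but it has the same shape:

    /* The "all virtual methods" pattern in C: an ops table of
     * function pointers.  Generic code calls through the table
     * and never knows which instance it is driving. */
    #include <stdio.h>
    #include <string.h>

    struct file_ops {
        int (*open)(const char *name);
        int (*read)(char *buf, int len);
    };

    /* One instance of the interface: a device that returns zeros. */
    static int zero_open(const char *name) { return 0; }
    static int zero_read(char *buf, int len)
    {
        memset(buf, 0, len);
        return len;
    }

    static struct file_ops zero_ops = { zero_open, zero_read };

    int main(void)
    {
        char buf[8];
        struct file_ops *f = &zero_ops;  /* all the generic code sees */

        f->open("zero");
        printf("read %d bytes\n", f->read(buf, sizeof(buf)));
        return 0;
    }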

What you have been proposing to add is all code that would have to live
in the generic part of the system. The things you are contemplating would
completely dwarf the rest of the generic code.

: But I understand if the design seems simple to the people familiar with it.
: We are more willing to accept things we have known and used, and reject
: things that are not very familiar. The danger is when we try to impose our
: interpretations of simplicity and complexity on others.

"An architect is someone who understands
the difference between what could be done
and what should be done"

--me again circa 1992, in the midst of cluster debates

I applaud your ability to realize what could be done. I am fully aware
that what you are proposing could be done. I think it is a bad idea; I
think ideas like yours have been proposed, tried, and failed repeatedly.
I think you got the first half right; now you need to spend some time
wondering why that pain-in-the-ass McVoy is so insistent that this is
something that should not be done.

By the way - I have zero control over what you or anyone else does. You
can win the whole debate by showing up with the code. But so far, you are
just suggesting this as an idea. If we're just talking ideas, I'm quite
happy to match my experience and other people's, much of it documented,
against your ideas. The reason for that is that there are a lot of
people who read this list quietly, think about what is
discussed, and act accordingly. I'm hoping that they will think about
this topic and try a new approach, not the same old thing that has
failed over and over in the past.

: >Funny you should mention these applications. Have you ever talked to one
: >of the people that work on systems like this? Have you ever spent time
: >understanding the basic issues? How do you propose to have 2 applications
: >share a cluster at the same time without thrashing? There's only one
: >answer and it is gang scheduling. Think about gang scheduling for a
: >while and then try and explain to me how process migration is going to
: >not completely screw up the scheduling.
:
: If needed, the (gang) scheduler can "lock" a process in a machine for any
: duration.

OK, that's half of the answer. Now please answer the second half. How
does process migration fit in with gang scheduling again?
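
For the quiet readers: gang scheduling means the whole cluster
context-switches as a unit, so that all of a job's processes are running
at the same instant and can talk to each other without stalling. Here is
a toy version, invented for illustration, not any real scheduler:

    /* Toy gang scheduler: in each time slice, every node runs a
     * piece of the SAME job, from a precomputed table.  Migrate
     * one process and the table is wrong: one node runs two of a
     * job's pieces while another idles, and the whole job stalls
     * on its slowest piece every slice. */
    #include <stdio.h>

    #define NODES 4

    int main(void)
    {
        const char *job[] = { "A", "B" };
        int slice, n;

        for (slice = 0; slice < 4; slice++) {
            printf("slice %d:", slice);
            for (n = 0; n < NODES; n++)
                printf("  node%d runs %s.%d", n, job[slice % 2], n);
            printf("\n");
        }
        return 0;
    }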

: And BTW, doesn't a flexible checkpoint/restart mechanism (something the
: programmer can use _any_time_ in his application) require solving most of
: the problems you mention?

Checkpoint/restart is a heavyweight operation. It's way too heavy to be used
in the way you are imagining. Yes, it does involve some of the features you
want, but it doesn't involve any of the hard migration issues. For example,
if a process has an open socket on node A and is migrated to node B, what
do you do with the socket?
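
Concretely: the kernel state behind that socket includes addresses and,
for TCP, sequence numbers that the peer has tied to node A. You can
write the process's memory image to disk, but not the peer's view of it.
A small, runnable demonstration (single machine, real calls):

    /* The fd is just a handle; the connection state belongs to
     * this host's TCP stack.  Checkpoint the process on node A,
     * restart it on node B, and the peer is still talking to A. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void)
    {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in a;
        socklen_t len = sizeof(a);

        memset(&a, 0, sizeof(a));
        a.sin_family = AF_INET;      /* any local address, any port */
        bind(s, (struct sockaddr *)&a, sizeof(a));
        getsockname(s, (struct sockaddr *)&a, &len);

        /* This port is owned by THIS machine's stack; no memory
         * image carries it to another machine. */
        printf("bound to local port %d\n", ntohs(a.sin_port));
        close(s);
        return 0;
    }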
