Re: Remote fork() and Parallel Programming

mshar@vax.ipm.ac.ir
Wed, 10 Jun 1998 15:27:57 +0330


Hi,

lm@bitmover.com (Larry McVoy) wrote:

>: A remote fork() extension will be a very useful addition. This should be
>: followed by process migration and dynamic load balancing. Is anyone
>: working on these problems?
>
>People who think this is a good idea may not be exactly right [...]

This is a matter of debate, as I'll explain below.

>My personal take on these issues is:
>
>    BAD                       GOOD
>    ---                       ----
>    remote fork()             remote exec()
>    process migration         checkpoint / restart
>    dynamic load balancing    static load balancing across machines
>    across machines           + dynamic load balancing within
>                              (SMP) machines.

The items listed under "BAD" are more powerful than the "GOOD" ones:

*) remote fork() handles the run-time state of a process, while remote
exec() does not bother with that. Please keep in mind that the application
programmer will have a hard time duplicating the run-time state of a process
without support from the operating system.

*) process migration is more flexible and more transparent than an explicit
checkpoint / restart mechanism. Dynamic load balancing and process migration
should come together. The combination of the two makes it easier to use the
available computing resources optimally, without bothering the programmer.
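
To make the fork()/exec() difference concrete, here is a minimal local
sketch in Python. It is only an analogy on a single Unix machine, not a
remote implementation, and the function names (child_via_fork,
child_via_exec) are made up for this illustration. The forked child finds
the parent's in-memory state already there; the exec-style child is a
fresh program, so the programmer has to pass the state explicitly (here
through argv).

```python
import os
import subprocess
import sys


def child_via_fork(state):
    """fork(): the child gets a copy of the parent's memory, `state` included."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        try:
            os.close(read_fd)
            # `state` is simply available; the programmer serialized nothing.
            os.write(write_fd, str(sum(state)).encode())
        finally:
            os._exit(0)  # leave immediately, skipping the parent's cleanup code
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        result = reader.read()
    os.waitpid(pid, 0)
    return int(result)


def child_via_exec(state):
    """exec()-style: a fresh program starts, so state is passed by hand."""
    code = "import sys; print(sum(int(x) for x in sys.argv[1:]))"
    done = subprocess.run(
        [sys.executable, "-c", code, *map(str, state)],
        capture_output=True, text=True, check=True,
    )
    return int(done.stdout)


if __name__ == "__main__":
    state = [1, 2, 3, 4]
    print(child_via_fork(state), child_via_exec(state))
```

With a list of integers the explicit hand-off is trivial. With open files,
sockets, or a large heap it is not, which is the burden the first point
above refers to.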

After all, the operating system is the only entity that can be aware of the
current resource-usage situation on the different computers. An application
program(mer) cannot know this entire state. More importantly, it cannot in
general predict the future, so in a static load balancing environment the
reaction to changes in resource usage will not be satisfactory.

What I want to say is that, as an all-knowing authority, the operating
system is in a good position to use mechanisms like remote fork() and dynamic
load balancing for the good of _all_ the running application programs.
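
The point about static load balancing can be illustrated with a toy
simulation. This is a sketch under simple assumptions of my own: tasks
arrive in a fixed order, their durations are not known in advance, the
static scheme is plain round-robin, and the dynamic scheme gives each task
to whichever machine becomes free first. Migration and communication costs
are ignored, and the function names are hypothetical.

```python
import heapq


def static_makespan(durations, n_machines):
    """Round-robin placement decided up front, blind to actual load."""
    loads = [0] * n_machines
    for i, d in enumerate(durations):
        loads[i % n_machines] += d
    return max(loads)


def dynamic_makespan(durations, n_machines):
    """Each task goes to the machine that becomes free earliest."""
    free_at = [(0, m) for m in range(n_machines)]  # (time free, machine id)
    heapq.heapify(free_at)
    finish = 0
    for d in durations:
        t, m = heapq.heappop(free_at)
        heapq.heappush(free_at, (t + d, m))
        finish = max(finish, t + d)
    return finish


if __name__ == "__main__":
    durations = [8, 1, 1, 1, 8, 1, 1, 1]
    print(static_makespan(durations, 2))   # 18: both long tasks hit machine 0
    print(dynamic_makespan(durations, 2))  # 11
```

The static scheme is not always worse. It simply cannot recover when the
durations turn out to be uneven, and only something that watches the
machines at run time can do that.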

Now we come to more practical points: It may be possible to simulate the
"BAD" mechanisms with the "GOOD" ones (actually, someone implementing the
"BAD" mechanisms inside the operating system might very well do so, as the
"GOOD" ones are more primitive), but I am not sure it is a good idea to
make application programmers face such issues. One should remember
that application programmers have other things to worry about (like the
problems that their work is supposed to solve in the first place).

>Yup, there are limitations with my point of view. But it is easy to
>implement what I'm describing and it solves most problems easily. Some
>problems need to be recoded.

What I prefer is doing all the recoding inside the kernel! I'd rather
see a system where programmers write their code much as they do now, and
the system then handles things for them. This also needs minimal recoding:
the programmer always calls fork(), and the OS decides whether the system
call is executed locally or remotely. The OS handles all the communication
if parts of a program are running on different computers. The system can
also migrate processes to balance the cluster's resource usage. All the
"GOOD" mechanisms, on the other hand, will ultimately require some work on
the part of the application programmers.
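
The local-or-remote decision described above could look roughly like the
following. This is a hypothetical placement policy written in Python only
to show the shape of the decision. It is not how any existing kernel or
DIPC works, and the load metric, the threshold, and the name choose_node
are all assumptions made for this example.

```python
def choose_node(loads, local, threshold=1.0):
    """Pick the node on which a newly forked process should start.

    loads     -- dict mapping node name to its current load (assumed metric)
    local     -- name of the node where fork() was called
    threshold -- minimum load difference that justifies going remote,
                 standing in for the cost of moving the process state
    """
    best = min(loads, key=loads.get)
    if loads[local] - loads[best] > threshold:
        return best
    return local


if __name__ == "__main__":
    print(choose_node({"a": 3.0, "b": 0.5}, local="a"))  # remote node "b"
    print(choose_node({"a": 1.0, "b": 0.5}, local="a"))  # stays on "a"
```

The application never sees this function. It calls fork() as usual, and
the policy lives in one place instead of in every program.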

Yes, implementing this won't be easy, but it has to be done only once, and
then a large number of application programmers can benefit from it.

>Given that clusters are not anywhere near the common case (no matter how
>much you and I like them), it would seem prudent to do the lesser amount
>of change to the kernel that gains you most of the benefit [...]

The chicken-and-egg problem again :-) Many people will not start using
clusters as long as programming them is difficult! What you suggest might
ensure that this remains so. I am sure you agree that it is better to
"change" the situation than to "cope" with it.

> [...] People who
>haven't learned this lesson have repeatedly built overly complex and
>expensive clusters. Let's not do that to Linux.

Linux uses a monolithic kernel. I myself don't think this is a good thing,
but we should make our efforts compatible with this design. Adding to the
kernel is not such a bad thing here. If something is added to the kernel in
a way compatible with the mechanisms it extends, then application
programmers will benefit a lot.

In short: It might be better to add a few thousand lines of code to the
kernel and be spared the trouble of adding a few hundred lines of code
to _many_ application programs. I hope the net effect will be fewer lines of
code and far fewer bugs.

Maybe when application programmers see that programming for a cluster
is not that different from their normal programming methods (while bringing
many advantages), more people will start using clusters.

I hope Mr. Linus Torvalds will have a positive attitude to such changes.
Without this, any effort to bring such kernel-level functionality (like
DIPC) to Linux will have little impact. And I believe that if Linux is to
compete with the likes of Windows NT, it had better be well equipped with
these more advanced mechanisms.

-Kamran Karimi
