Re: Remote fork() and Parallel Programming

Andrej Presern (andrejp@luz.fe.uni-lj.si)
Sat, 13 Jun 1998 12:59:50 +0200


ralf@uni-koblenz.de wrote:
> Think of issues like a filepointer for a file which has grown since the
> process snapshot was taken / started to be moved. Assume it was pointing
> to the last byte and the file has grown. So now, where to position the
> file when relaunching the process?

Let's consider the scenario that you describe in a generally persistent
system.

Process A has some data that it works on and wants it saved. It calls
the filesystem object and tells it to write the data to permanent
storage. The filesystem creates a new file object (in memory), copies
the data into the file (still in memory) and returns to the caller (no
writing yet), so that it can proceed with its work. This is a normal
buffering operation and we do things like this right now.

After some time, a system-wide snapshot kicks in. This means that all
pages that have been modified since the last snapshot are written to
permanent storage, including process A, the filesystem object, the file
buffer that was still in memory and all filesystem internal structures,
such as the file pointer for process A's file.
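
As a rough illustration of "write only what was modified since the last
snapshot", here is a toy sketch in Python. Everything in it is an
assumption made for illustration: the class name PagedMemory, the use of
dictionaries as stand-ins for memory pages and permanent storage, and
the page names. A real persistent system would track dirty pages with
hardware support and write them out atomically.

```python
# Hypothetical toy model: "pages" are dict entries, "disk" is another dict.
class PagedMemory:
    def __init__(self):
        self.pages = {}      # in-memory page contents, keyed by page id
        self.dirty = set()   # page ids modified since the last snapshot
        self.disk = {}       # stand-in for permanent storage

    def write(self, page_id, data):
        """Modify a page in memory only and remember that it is dirty."""
        self.pages[page_id] = data
        self.dirty.add(page_id)

    def snapshot(self):
        """Write only the modified pages to 'disk'; return how many were written."""
        written = len(self.dirty)
        for page_id in self.dirty:
            self.disk[page_id] = self.pages[page_id]
        self.dirty.clear()
        return written


mem = PagedMemory()
mem.write("process_A", "state 1")
mem.write("file_buffer", b"hello")
mem.write("file_pointer", 5)
first = mem.snapshot()    # all three pages are new, so all are written

mem.write("file_pointer", 6)
second = mem.snapshot()   # only the file pointer changed since last time
```

Note that the file buffer and the file pointer are saved by the same
mechanism as process A itself; nothing here treats "files" specially.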

Process A can now resume its work, reading and 'writing' to the file.

Now disaster strikes: a power failure takes the system down. But save
your worries - when the system comes up again, it reads in the last
saved snapshot and continues from that point, exactly as it was at the
time the snapshot was taken. The amount of work that has been lost is
the work that was done since the snapshot, which usually means a few
minutes of operation if the snapshot is taken once every few minutes
(the usual policy, and one that can normally be adjusted at run time).
The data of all objects in the system, however, is consistent, including
(but absolutely not limited to) process A, the filesystem object, the
file and all internal file-related structures and pointers, such as
process A's file pointer. (Careful observers have probably noticed that
if we have general persistence, we don't really need a filesystem
anymore (though we may want to keep the interfaces for compatibility
reasons), since the filesystem object is just another object, whose data
and state are saved in the snapshot along with all other objects.)
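
To make the consistency argument concrete, here is a minimal sketch in
Python of the crash / restart sequence described above. The names
(PersistentSystem, append, crash_and_restart) are hypothetical, and a
deep copy of one dictionary stands in for the system-wide snapshot; it
only shows why the file and its file pointer cannot disagree after
recovery.

```python
import copy


# Hypothetical toy model of a system whose whole state is snapshotted at once.
class PersistentSystem:
    def __init__(self):
        # The file contents and the file pointer live in the same state.
        self.state = {"file": bytearray(), "file_pointer": 0}
        self.saved = copy.deepcopy(self.state)   # last snapshot on "disk"

    def append(self, data):
        """Process A appends to the file; the pointer moves to the new end."""
        self.state["file"] += data
        self.state["file_pointer"] = len(self.state["file"])

    def snapshot(self):
        """Save the entire state in one step, file and pointer together."""
        self.saved = copy.deepcopy(self.state)

    def crash_and_restart(self):
        """Lose everything in memory and resume from the last snapshot."""
        self.state = copy.deepcopy(self.saved)


system = PersistentSystem()
system.append(b"abc")
system.snapshot()
system.append(b"defg")        # work done after the snapshot
system.crash_and_restart()    # power failure: the post-snapshot work is lost
```

After the restart the file is back to the three bytes it held at
snapshot time and the file pointer is back to 3, so the pointer still
refers to the end of the file as the restored system sees it. The four
bytes appended after the snapshot are simply gone, together with the
pointer update that went with them.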

Mind you that this is a normal, transparent, system-wide snapshot /
continue operation in a NON-distributed environment (distributed systems
have a whole new set of problems to deal with, especially if we want
process migration (which IMHO is a very bad idea)).

Andrej

-- 
Andrej Presern, andrejp@luz.fe.uni-lj.si
