On Thursday 21 August 2008, Oren Laadan wrote:
Using a single handle (crid or a special file descriptor) to identify
the whole checkpoint is very useful - to be able to stream it (eg. over
the network, or through filters). It is also very important for future
features and optimizations. For example, to reduce downtime of the
application during checkpoint, one can use COW for dirty pages, and
only write-back the entire data after the application resumes execution.
Or imagine a use-case where one would like to keep the entire checkpoint
in memory. These are pretty hard to do if you split the handling between
multiple files or handles.
right.
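The streaming argument can be illustrated in user space. This is only a sketch under stated assumptions: write_image() and stream_through_pipe() are hypothetical names, the "image" is an arbitrary byte string, and no real checkpoint interface is involved. The writer only ever sees one file descriptor, so the same code works whether that descriptor is a file, a socket, or a pipe into a filter.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a checkpoint writer: it only ever sees one fd. */
static int write_image(int fd, const char *image, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, image, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry the write */
            return -1;
        }
        image += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Stream a fake image through a pipe to a consumer process that counts
 * the bytes it receives. Returns that count, or -1 on error.
 */
static long stream_through_pipe(const char *image, size_t len)
{
    int data[2], result[2], status;
    long count = 0;
    pid_t pid;

    if (pipe(data) < 0 || pipe(result) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char buf[64];
        ssize_t n;

        /* Consumer: stands in for a network peer or a filter. */
        close(data[1]);
        close(result[0]);
        while ((n = read(data[0], buf, sizeof(buf))) > 0)
            count += n;
        if (write(result[1], &count, sizeof(count)) != (ssize_t)sizeof(count))
            _exit(1);
        _exit(0);
    }
    close(data[0]);
    close(result[1]);
    status = write_image(data[1], image, len);
    close(data[1]);                 /* EOF marks the end of the image */
    if (read(result[0], &count, sizeof(count)) != (ssize_t)sizeof(count))
        status = -1;
    close(result[0]);
    waitpid(pid, NULL, 0);
    return status < 0 ? -1 : count;
}
```

Calling stream_through_pipe("checkpoint-image", 16) should report 16 bytes received. With the state split across several files or handles, the consumer would instead need some extra framing to tell the pieces apart.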
On the restart side, I think the most consistent interface would
be a new binfmt_chkpt implementation that you can use to execve
a checkpoint, just like you execute an ELF file today. The binfmt
can be a module (unlike a syscall), so an administrator that is
afraid of the security implications can just disable it by not
loading the module. In an execve model, the parent process can
set up anything related to credentials as well as it is allowed
to and then let the kernel do the rest.
This is an interesting idea but not without its problems. In particular,
a successful execve() by one thread destroys all the others.
Right, execve currently assumes that the new process starts up with
a single thread, but a potential binfmt_chkpt would need to potentially
start multithreaded. I guess this either requires execve to reuse
the existing threads (assuming they have been set up correctly in
advance) or to create new ones according to the context of the
checkpoint data. It may not be as easy as I thought initially, but
both seem possible.
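The current execve() behaviour for threads can be checked from user space. This is a minimal sketch, assuming a POSIX system with a "sleep" binary in PATH; worker() and worker_survives_exec() are names made up for the illustration. A child process starts a second thread that would write to a pipe after 200 ms, while its main thread immediately execs "sleep 1". If the second thread outlived the exec, its output would show up in the pipe.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Second thread: would report in after 200 ms if it outlived the exec. */
static void *worker(void *arg)
{
    struct timespec delay = { 0, 200 * 1000 * 1000 };
    ssize_t n;

    (void)arg;
    nanosleep(&delay, NULL);
    n = write(STDOUT_FILENO, "alive\n", 6);
    (void)n;
    return NULL;
}

/*
 * Returns 1 if the worker's output appears after execve(), 0 if it does
 * not, and -1 if the experiment could not be set up.
 */
static int worker_survives_exec(void)
{
    int pipefd[2], status, seen = 0;
    char buf[16];
    pid_t pid;

    if (pipe(pipefd) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_t tid;

        /* Child: route stdout into the pipe, then start the worker. */
        close(pipefd[0]);
        if (dup2(pipefd[1], STDOUT_FILENO) < 0)
            _exit(126);
        close(pipefd[1]);
        if (pthread_create(&tid, NULL, worker, NULL) != 0)
            _exit(126);
        /* Main thread replaces the process image while the worker sleeps. */
        execlp("sleep", "sleep", "1", (char *)NULL);
        _exit(127);                 /* only reached if the exec failed */
    }
    close(pipefd[1]);
    /* EOF arrives once the exec'ed "sleep 1" exits and the pipe closes. */
    while (read(pipefd[0], buf, sizeof(buf)) > 0)
        seen = 1;
    close(pipefd[0]);
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        return -1;
    return seen;
}
```

Built with -pthread, worker_survives_exec() should return 0: nothing arrives from the worker, because the exec tore it down. A binfmt_chkpt handler would have to either keep such threads alive across the exec or recreate them from the checkpoint data.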
Restarting a whole set of processes from a checkpoint would be
a relatively simple extension of that.
Also, it isn't clear how this can work with pre-copying and live migration.
And finally, I'm not sure how to handle shared objects in this manner.
What do you mean by pre-copying?
How is live-migration different from restarting a previously saved
task from the same machine?
--
As for a kernel module: it is easy to implement most of the
checkpoint/restart functionality in a kernel module, leaving only the
syscall stubs in the kernel.
Yeah, I've done the same in spufs, but I still think it's ugly ;-)
Arnd <><