Re: Core dumps & restarting

Christos Ricudis (ricudis@paiko.the.forthnet.gr)
Tue, 29 Oct 1996 17:52:20 +0200


> Why not dump the core ram image to another "machine", drop
> reservations on all the SCSI devices you are talking to, and then tell
> the machine "mount my disks, assume my ip addresses, and act like me,
> because I'm going down". It can work with something like a 3 minute

Look at the University of Wisconsin's Condor (http://www.cs.wisc.edu/condor).
It does process checkpointing and process migration entirely in user space,
and as portably as that can be done. I worked on a Linux port for a while,
but time constraints eventually forced me to all but abandon it. Fortunately,
they have somebody else working on the Linux port. One of these weeks I'm
going to download the source tree and start working on it again....

Christos Ricudis
ricudis@paiko.the.forthnet.gr