Re: State of linux checkpointing?

From: Tim Connors
Date: Wed Apr 28 2004 - 18:19:40 EST


Jeff Garzik <jgarzik@xxxxxxxxx> said on Wed, 28 Apr 2004 16:23:00 -0400:
> Neal D. Becker wrote:
> > I wonder if there is a checkpointing that will work with 2.6 kernels?
> >
> > I only need relatively basic checkpointing. No sockets or fancy stuff.
>
> You only need checkpointing when your application programmers are lazy
> and don't care about data integrity. :)

Or you are running some kind of cluster where you want the
applications to be checkpointed transparently without the application
knowing the details of how or when they will be swapped out (but this
will need sockets anyway, so won't happen anytime soon).

'Tis a pain that the alpha cluster here can suspend long running jobs
for a pile of smaller jobs, and then resume, but the linux cluster can
do no such fanciness (yes, we do manual checkpointing, but it's prone
to bugs - and finding such a bug after 30 days of compute time really
sucks balls).


--
TimC -- http://astronomy.swin.edu.au/staff/tconnors/
Beware of Programmers who carry screwdrivers.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/