Re: [RFC] [PATCH 00/13] Introduce task_pid api

From: Ray Bryant
Date: Tue Nov 15 2005 - 15:31:34 EST


On Tuesday 15 November 2005 13:41, Serge E. Hallyn wrote:
> Quoting Ray Bryant (raybry@xxxxxxxxxxxxxxxxx):
> > On Monday 14 November 2005 15:23, Serge E. Hallyn wrote:
> > > --
> > >
> > > I'm part of a project implementing checkpoint/restart processes.
> > > After a process or group of processes is checkpointed, killed, and
> > > restarted, the changing of pids could confuse them. There are many
> > > other such issues, but we wanted to start with pids.
> >
> > I've read through the rest of this thread, but it seems to me that the
> > real problems are in the basic assumptions you are making that are
> > driving the rest of this effort and perhaps we should be examining those
> > assumptions instead of your patch.
>
> Ok.
>
> > For example, from what I've read (particularly Hubertus's post that the
> > pid could be in a register), I'm inferring that what you want to do is to
> > be able to checkpoint/restart an arbitrary process at an arbitrary time
> > and without any special support for checkpoint/restart in that process.
>
> Yes.
>
> > Also (c. f. Dave Hansen's post on the number of Xen virtual machines
> > required), you appear to think that the number of processes on the
> > system for which checkpoint/restart should be enabled is large (more or
> > less the same as the number of processes on the system).
>
> Right.
>
> > Am I reading this correctly?
>
> As far as I can see, yes.
>
> -serge

Personally, I think that these assumptions are incorrect for a
checkpoint/restart facility. I think that:

(1) It is really only possible to checkpoint/restart a cooperative process.
For this to work with uncooperative processes you have to figure out (for
example) how to save and restore the file system state (e.g., how do you
get the file position set correctly for an open file in the restored program
instance?). And this doesn't even consider what to do with open network
connections.
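
To make the file position example concrete, here is a minimal sketch of what
a cooperative task (or a container acting on its behalf) might record for one
open regular file. The struct and function names are hypothetical, and the
sketch assumes the file still exists at the same path at restart time, which
is exactly the kind of assumption an uncooperative process can't be held to.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of the state saved for one open regular file. */
struct saved_file {
    char  path[256];  /* assumed: the path is still valid at restart */
    int   flags;      /* access mode and status flags from F_GETFL */
    off_t offset;     /* file position at checkpoint time */
};

/* Record the flags and current offset of fd. Returns 0, or -1 on error. */
int save_file_state(int fd, const char *path, struct saved_file *out)
{
    off_t pos = lseek(fd, 0, SEEK_CUR);   /* query position without moving */
    int flags = fcntl(fd, F_GETFL);

    if (pos == (off_t)-1 || flags == -1)
        return -1;
    if (strlen(path) >= sizeof(out->path))
        return -1;
    strcpy(out->path, path);
    out->flags = flags;
    out->offset = pos;
    return 0;
}

/* Reopen the file and seek to the saved offset. Returns the new fd or -1. */
int restore_file_state(const struct saved_file *in)
{
    /* Never create or truncate the file on restore. */
    int fd = open(in->path, in->flags & ~(O_CREAT | O_EXCL | O_TRUNC));

    if (fd == -1)
        return -1;
    if (lseek(fd, in->offset, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Note that this says nothing about the file's contents having changed between
checkpoint and restart, about unlinked-but-open files, or about pipes and
sockets, for which lseek() fails outright.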

Similarly, what does one do about the contents of System V shared memory
regions or the state of System V semaphores? I'm sure we can come up with
many more such problems through a careful study of the Linux/Unix API.

(Note that "cooperation" in this context can also mean "willing to run inside
of a container of some kind that supports checkpoint/restart".)

So you can probably only checkpoint the process at certain points in its
lifetime, points which the application should be willing to identify in some
way. And I would argue that at such points in time, you can require that
the current register state doesn't include the results of a system call such
as getpid(), couldn't you?
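
As a rough sketch of what such an application-identified point could look
like: the task keeps its pid in one known place and re-queries it after the
checkpoint call returns, since the code that resumes there may be a restarted
instance with a different pid. All names here are hypothetical, and the
do_checkpoint callback merely stands in for whatever mechanism actually
takes the checkpoint.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* The only place the application keeps a getpid() result. */
static pid_t cached_pid;

void app_init(void)
{
    cached_pid = getpid();
}

/*
 * Hypothetical hook the application calls at a point it declares safe.
 * By convention, no getpid() result is live in a local variable (and
 * hence in a register) across this call.
 */
void checkpoint_safe_point(int (*do_checkpoint)(void))
{
    if (do_checkpoint != NULL)
        do_checkpoint();

    /* We may now be running in a restarted instance: refresh the pid. */
    cached_pid = getpid();
}

pid_t app_pid(void)
{
    return cached_pid;
}
```

The rest of the application would then call app_pid() rather than holding on
to its own copy of the value.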

(2) Checkpoint/restart really only makes sense for a long-running,
resource-intensive job (e.g., a job that is doing a lot of work and for
which recovery is therefore hard, perhaps as hard as re-running the entire
job). By their very nature, there are probably only a few such jobs running
on the system. If there are lots of such jobs on the system, then re-running
each one can't be that hard, can it?

So, I guess my question wrt the task_pid API is the following: Given that
there are a lot of other problems to solve before transparent checkpointing
of uncooperative processes is possible, why should this partial solution be
accepted into the mainline kernel and "set in stone", so to speak?

Don't get me wrong, I would love for there to be a commonly accepted
checkpoint/restart API. But I don't think that this can be done
transparently at the kernel level and without some cooperation from the
target task.
--
Ray Bryant
AMD Performance Labs Austin, Tx
512-602-0038 (o) 512-507-7807 (c)
