Re: [RFC v6][PATCH 0/9] Kernel based checkpoint/restart

From: Ingo Molnar
Date: Thu Oct 09 2008 - 09:44:55 EST

* Dave Hansen <dave@xxxxxxxxxxxxxxxxxx> wrote:

> On Thu, 2008-10-09 at 15:17 +0200, Ingo Molnar wrote:
> > * Dave Hansen <dave@xxxxxxxxxxxxxxxxxx> wrote:
> > > On Thu, 2008-10-09 at 14:46 +0200, Ingo Molnar wrote:
> > > > i'm wondering about the following productization aspect: it would be
> > > > very useful to applications and users if they knew whether it is safe to
> > > > checkpoint a given app. I.e. whether that app has any state that cannot
> > > > be stored/restored yet.
> > >
> > > Absolutely!
> > >
> > > My first inclination was to do this at checkpoint time: detect and
> > > tell users why an app or container can't actually be checkpointed.
> > > But, if I get you right, you're talking about something that happens
> > > more during the runtime of the app than during the checkpoint. This
> > > sounds like a wonderful approach to me, and much better than what I
> > > was thinking of.
> > >
> > > What kind of mechanism do you have in mind?
> > >
> > > int sys_remap_file_pages(...)
> > > {
> > > 	...
> > > 	oh_crap_we_dont_support_this_yet(current);
> > > }
> > >
> > > Then the oh_crap..() function sets a task flag or something?
> >
> > yeah, something like that. A key aspect of it is that it has to be very
> > low-key on the source code level - we don't want to sprinkle the kernel
> > with anything ugly. Perhaps something pretty explicit:
> >
> > current->flags |= PF_NOCR;
>
> Am I miscounting, or are we out of these suckers on 32-bit platforms?

We've still got a few holes: you can pick 0x00000020, 0x00000080,
0x00004000, 0x08000000.
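
For anyone who wants to recount, a rough sketch of the hole-hunting in
plain C. free_flag_bits() is a made-up helper, not anything in the tree,
and the "used" mask you feed it is an assumption: take it from the PF_*
definitions in your own sched.h rather than from this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: collect every single-bit mask that is NOT set in
 * 'used', lowest bit first. Returns how many free bits were found.
 * 'out' must have room for 32 entries.
 */
static size_t free_flag_bits(uint32_t used, uint32_t out[32])
{
	size_t n = 0;

	assert(out != NULL);

	for (unsigned int bit = 0; bit < 32; bit++) {
		uint32_t mask = UINT32_C(1) << bit;

		if (!(used & mask))
			out[n++] = mask;
	}
	return n;
}
```

Called with a mask that has every bit set except the four values above,
it hands back exactly those four.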

> > as we do the same thing today for certain facilities:
> >
> > current->flags |= PF_NOFREEZE;
> >
> > you probably want to hide it behind:
> >
> > set_current_nocr();
>
> Yeah, that all looks reasonable. Letting this be a dynamic thing
> where you can move back and forth between the two states would make a
> lot of sense too. But, for now, I guess it can be a one-way trip.

there might be races as well, especially with proxy state - and
current->flags updates are not serialized.

So maybe it should be a completely separate flag after all? Stick it
into the end of task_struct perhaps.
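
Something along these lines, as a userspace model in C11 and not kernel
code. struct task_model, task_set_nocr() and task_may_checkpoint() are
all hypothetical names, and the atomic field is just one assumed way of
making the update safe when it is done on another task's behalf:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct, not the kernel's layout. */
struct task_model {
	unsigned long flags;	/* models ->flags: plain, unserialized updates */
	atomic_bool nocr;	/* separate "cannot be checkpointed" marker */
};

static void task_model_init(struct task_model *t)
{
	assert(t != NULL);
	t->flags = 0;
	atomic_init(&t->nocr, false);
}

/*
 * One-way trip: nothing in this sketch ever clears the marker. The store
 * is atomic, so it does not depend on who else is touching t->flags.
 */
static void task_set_nocr(struct task_model *t)
{
	atomic_store(&t->nocr, true);
}

/* What a checkpoint path could ask before it starts. */
static bool task_may_checkpoint(struct task_model *t)
{
	return !atomic_load(&t->nocr);
}
```

The point of the separate field is only that setting it never does a
read-modify-write on the shared flags word.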

Ingo