Re: [RFC v6][PATCH 0/9] Kernel based checkpoint/restart

From: Oren Laadan
Date: Fri Oct 17 2008 - 03:10:31 EST

Peter Chubb wrote:
>>>>>> "Oren" == Oren Laadan <orenl@xxxxxxxxxxxxxxx> writes:
>
> Oren> Daniel Lezcano wrote:
>
>>>> The one exception (and it is a tedious one !) are states in which
>>>> the task is already frozen by definition: any ptrace blocking
>>>> point where the tracee waits for the tracer to grant permission to
>>>> proceed with its execution. Another example is in vfork(), waiting
>>>> for completion.
>>> I would say these are perfect places for "may be
>>> non-checkpointable" :)
>
> Oren> For now, yes. But we definitely want this capability in the long
> Oren> run; otherwise we won't be able to checkpoint a kernel compile
> Oren> ('make' uses vfork), or anything with 'gdb' running inside, or
> Oren> 'strace', and other goodies.
>
> The strace/gdb example is *really* hard; but for vfork, you just wait
> until it's over. The interval between vfork and exec/exit should be
> short enough not to affect the overall time for a checkpoint (and
> checkpoint can be fairly slow anyway --- on the HPC machines we used
> to do it on, writing half a terabyte of checkpoint image to disc could take
> many minutes. In hindsight, we should have multithreaded it).
> Waiting for a vforked process to exec is less than a millisecond.

Your observation is correct. On the other hand, it is fairly easy to
add the necessary glue for the vfork() case, and it's important to do
it because:
(a) as noted, a malicious user can exploit the wait, e.g. with a
    vforked child that never calls exec or exit, which would stall
    the checkpoint.
(b) if you run 'make -j 32' you are likely to have an ongoing vfork
    at any given moment.
(c) compared to ptrace, vfork() is the simple case to solve.
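
To make the window in question concrete, here is a minimal userspace
sketch (not part of the patch set) that times how long a parent stays
suspended inside vfork() until the child execs. The helper name
vfork_window_ns() is hypothetical, and running "true" as the child
program is just an assumption for illustration.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: returns how long the
 * parent was suspended inside vfork(), in nanoseconds, or -1 on error.
 * The child's wait status is stored in *child_status.
 */
static long long vfork_window_ns(int *child_status)
{
    struct timespec before, after;
    pid_t pid;
    int status;

    clock_gettime(CLOCK_MONOTONIC, &before);

    pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only an exec call or _exit() is allowed here. */
        execlp("true", "true", (char *)NULL);
        _exit(127); /* reached only if exec failed */
    }

    /* Parent: resumes only after the child has exec'ed or exited. */
    clock_gettime(CLOCK_MONOTONIC, &after);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    *child_status = status;

    return (after.tv_sec - before.tv_sec) * 1000000000LL
           + (after.tv_nsec - before.tv_nsec);
}
```

The value returned is the interval during which the parent sits in
the vfork() wait, which is the state a checkpoint would have to either
wait out or handle explicitly. A child that delays its exec stretches
that interval for as long as it likes.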

Oren.

> --
> Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au
> http://www.ertos.nicta.com.au ERTOS within National ICT Australia