Re: [PATCH 1/2] pidns: guarantee that the pidns init will be the last pidns process reaped

From: Pavel Emelyanov
Date: Thu May 31 2012 - 06:33:06 EST


On 05/30/2012 10:15 PM, Oleg Nesterov wrote:
> From: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
>
> Today we have a twofold bug. Sometimes release_task on pid == 1 in a
> pid namespace can run before other processes in that pid namespace have
> had release_task called. As a result, pid_ns_release_proc can be
> called before the last proc_flush_task() is done using
> upid->ns->proc_mnt, resulting in the use of a stale pointer. This same
> set of circumstances can lead to waitpid(...) returning for a process
> started with clone(CLONE_NEWPID) before every process in the pid
> namespace has actually exited.
>
> To fix this, modify zap_pid_ns_processes to wait until all other
> processes in the pid namespace have exited, even EXIT_DEAD zombies.
>
> The delay_group_leader and related tests ensure that the thread group
> leader will be the last thread of a process group to be reaped, or to
> become EXIT_DEAD and self-reap. With the change to zap_pid_ns_processes
> we get the guarantee that pid == 1 in a pid namespace will be the last
> task that release_task is called on.
>
> With pid == 1 being the last task to pass through release_task,
> pid_ns_release_proc can no longer be called too early, nor can wait
> return before all of the EXIT_DEAD tasks in a pid namespace have exited.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>

Acked-by: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
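
For readers following the ordering argument, here is a minimal user-space
sketch of the invariant the patch description states: every other task in
the namespace is released before init is, and the proc mount goes away only
once init is released. This is not kernel code. The struct and the model_*
helpers are hypothetical names invented for illustration, and the boolean
flag merely stands in for upid->ns->proc_mnt being valid.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_pidns {
    int unreleased;    /* tasks not yet released, including the pidns init */
    bool proc_mnt_ok;  /* stands in for upid->ns->proc_mnt still being valid */
};

/* Create a modelled namespace holding ntasks tasks, init included. */
static struct model_pidns model_pidns_create(int ntasks)
{
    struct model_pidns ns = { .unreleased = ntasks, .proc_mnt_ok = true };
    return ns;
}

/* Releasing a non-init task still needs a valid proc mount. */
static bool model_release_other(struct model_pidns *ns)
{
    if (!ns->proc_mnt_ok || ns->unreleased <= 1)
        return false;          /* no mount, or only init is left */
    ns->unreleased--;
    return true;
}

/* Init may be released only once it is the sole unreleased task. */
static bool model_release_init(struct model_pidns *ns)
{
    if (ns->unreleased != 1)
        return false;          /* init must keep waiting */
    ns->unreleased = 0;
    ns->proc_mnt_ok = false;   /* only now is the proc mount dropped */
    return true;
}
```

In this model, with three tasks in the namespace, model_release_init()
refuses until model_release_other() has succeeded twice, which mirrors
the guarantee described in the quoted text.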
