Re: [PATCH 11/11] pidns: Support unsharing the pid namespace.

From: Gao feng
Date: Tue Nov 20 2012 - 21:55:11 EST


on 2012/11/17 00:35, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
>
> Unsharing of the pid namespace, unlike unsharing of other namespaces,
> does not take effect immediately. Instead it affects the children
> created with fork and clone. The first of these children becomes the init
> process of the new pid namespace, the rest become oddball children
> of pid 0. From the point of view of the new pid namespace the process
> that created it is pid 0, as its pid does not map.
>
> A couple of different semantics were considered but this one was
> settled on because it is easy to implement and it is usable from
> pam modules, which are the core reason for the existence of unshare.
>
> I took a survey of the callers of pam modules and the following
> appears to be a representative sample of their logic.
> {
> 	setup stuff include pam
> 	child = fork();
> 	if (!child) {
> 		setuid()
> 		exec /bin/bash
> 	}
> 	waitpid(child);
>
> 	pam and other cleanup
> }
>
> As you can see there is a fork to create the unprivileged user
> space process, which means that the unprivileged user space
> process will appear as pid 1 in the new pid namespace. Further,
> most login processes do not cope with extraneous children, which
> means shifting the duty of reaping extraneous child processes to
> the creator of those extraneous children makes the system more
> comprehensible.
>
> The practical reason for this set of pid namespace semantics is
> that it is simple to implement and to verify that they work correctly,
> whereas an implementation that requires changing the struct
> pid on a process comes with a lot more races and pain, not
> the least of which is that glibc caches getpid().
>
> These semantics are implemented by having two notions
> of the pid namespace of a process. There is task_active_pid_ns,
> which is the pid namespace the process was created with
> and the pid namespace that all pids are presented to
> that process in. The task_active_pid_ns is stored
> in the struct pid of the task.
>
> Then there is the pid namespace that will be used for children;
> that pid namespace is stored in task->nsproxy->pid_ns.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> ---

Acked-by: Gao feng <gaofeng@xxxxxxxxxxxxxx>