Re: [PATCH v3 1/2] fork: extend clone3() to support CLONE_SET_TID

From: Christian Brauner
Date: Wed Aug 07 2019 - 14:05:56 EST


On Wed, Aug 07, 2019 at 06:08:56PM +0200, Oleg Nesterov wrote:
> On 08/06, Adrian Reber wrote:
> >
> > @@ -2573,6 +2575,14 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
> > 		.tls = args.tls,
> > 	};
> >
> > +	if (size == sizeof(struct clone_args)) {
> > +		/* Only check permissions if set_tid is actually set. */
> > +		if (args.set_tid &&
> > +		    !ns_capable(pid_ns->user_ns, CAP_SYS_ADMIN))
>
> and I just noticed this uses pid_ns = task_active_pid_ns() ...
>
> is it correct?
>
> I feel I am totally confused, but should we use the same
> p->nsproxy->pid_ns_for_children passed to alloc_pid?

We need to have CAP_SYS_ADMIN in the user namespace that owns the target
pidns, i.e. the pidns in which we spawn the new process. That pidns is
not necessarily the caller's active pidns: pid_ns_for_children could've
been altered either by passing CLONE_NEWPID or by having called
unshare(CLONE_NEWPID) before. So yes, pid_ns_for_children is what we
want.
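
To illustrate the difference, here is a rough userspace model, not kernel
code. The struct names, the admin_in field and model_ns_capable() are
hypothetical stand-ins for illustration only, and the capability model
ignores ancestor user namespaces, which the real ns_capable() does take
into account. It only shows that the two checks can disagree once
pid_ns_for_children no longer matches the active pidns:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins, not the kernel's real definitions. */
struct user_ns { int id; };
struct pid_ns { struct user_ns *user_ns; };	/* owning user namespace */
struct nsproxy { struct pid_ns *pid_ns_for_children; };

struct task {
	struct pid_ns *active_pid_ns;	/* models task_active_pid_ns() */
	struct nsproxy *nsproxy;
	struct user_ns *admin_in;	/* hypothetical: the one userns where
					 * the task holds CAP_SYS_ADMIN */
};

/* Hypothetical model of ns_capable(ns, CAP_SYS_ADMIN). */
static bool model_ns_capable(const struct task *t, const struct user_ns *ns)
{
	return t->admin_in == ns;
}

/* Check against the caller's active pidns, as in the quoted hunk. */
static bool may_set_tid_active(const struct task *t, unsigned long set_tid)
{
	assert(t && t->active_pid_ns);
	/* Only check permissions if set_tid is actually set. */
	return !set_tid || model_ns_capable(t, t->active_pid_ns->user_ns);
}

/* Check against the pidns in which the new process is spawned. */
static bool may_set_tid_target(const struct task *t, unsigned long set_tid)
{
	assert(t && t->nsproxy && t->nsproxy->pid_ns_for_children);
	return !set_tid ||
	       model_ns_capable(t, t->nsproxy->pid_ns_for_children->user_ns);
}
```

For a task whose pid_ns_for_children is owned by a user namespace other
than the one owning its active pidns, the two helpers can give different
answers for the same set_tid value.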

Sorry again for the delay in my responses. On vacation atm.

Christian