Re: [PATCH v2 1/2] fork: extend clone3() to support CLONE_SET_TID

From: Adrian Reber
Date: Fri Aug 02 2019 - 03:25:21 EST


On Wed, Jul 31, 2019 at 07:41:36PM +0200, Oleg Nesterov wrote:
> On 07/31, Adrian Reber wrote:
> >
> > Extending clone3() to support CLONE_SET_TID makes it possible to restore a
> > process using CRIU without accessing /proc/sys/kernel/ns_last_pid and
> > race free (as long as the desired PID/TID is available).
>
> I personally like this... but please see the question below.
>
> > +struct pid *alloc_pid(struct pid_namespace *ns, int set_tid)
> > {
> > struct pid *pid;
> > enum pid_type type;
> > @@ -186,12 +186,28 @@ struct pid *alloc_pid(struct pid_namespace *ns)
> > if (idr_get_cursor(&tmp->idr) > RESERVED_PIDS)
> > pid_min = RESERVED_PIDS;
> >
> > - /*
> > - * Store a null pointer so find_pid_ns does not find
> > - * a partially initialized PID (see below).
> > - */
> > - nr = idr_alloc_cyclic(&tmp->idr, NULL, pid_min,
> > - pid_max, GFP_ATOMIC);
> > + if (set_tid) {
> > + /*
> > + * Also fail if a PID != 1 is requested
> > + * and no PID 1 exists.
> > + */
> > + if ((set_tid >= pid_max) || ((set_tid != 1) &&
> > + (idr_get_cursor(&tmp->idr) <= 1)))
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Ah, I forgot to mention... this should work but only because
> RESERVED_PIDS > 0. How about idr_is_empty() ?
>
>
> But the main question is how it can really help if ns->level > 0, unlikely
> CRIU will ever need to clone the process with the same pid_nr == set_tid
> in the ns->parent chain.

I'm not sure I understand what you mean. For CRIU, only the PID in the
process's own PID namespace is relevant.

> So may be kernel_clone_args->set_tid should be pid_t __user *set_tid_array?
> Or I missed something ?

I'm not sure why and how an array would be needed. Could you give me some
more details on why you think this is needed?
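
As a side note on the set_tid check quoted above, here is a minimal
user-space sketch of the behaviour it is meant to have. It is only a model,
not kernel code: `struct ns_model`, its `empty` flag and `check_set_tid()`
are hypothetical names made up for illustration. The `empty` flag stands in
for what idr_is_empty() would report, replacing the idr_get_cursor() <= 1
test from the hunk.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model of a PID namespace (not a kernel struct). */
struct ns_model {
	int pid_max;	/* exclusive upper bound for PIDs in this namespace */
	bool empty;	/* no PID allocated yet; stand-in for idr_is_empty() */
};

/*
 * Model of the check in the quoted hunk. As in the patch, this is only
 * reached when set_tid is non-zero.
 *
 * Returns 0 if the requested PID may be tried, -EINVAL otherwise.
 */
static int check_set_tid(const struct ns_model *ns, int set_tid)
{
	/* The requested PID must be below the namespace's pid_max. */
	if (set_tid >= ns->pid_max)
		return -EINVAL;

	/* Also fail if a PID != 1 is requested and no PID 1 exists. */
	if (set_tid != 1 && ns->empty)
		return -EINVAL;

	return 0;
}
```

In this model, requesting PID 1 in an empty namespace is accepted, any
other PID is rejected until the namespace has its PID 1, and a PID at or
above pid_max is always rejected. Whether the requested PID is actually
free is a separate matter that the allocation itself has to decide.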

Adrian