Re: [PATCH] pid: cosmetic changes to alloc_pid()

From: Oleg Nesterov

Date: Sat Mar 07 2026 - 04:24:06 EST

On 03/06, Mateusz Guzik wrote:
>
> Commit 6d864a1b182532e7 ("pid: only take pidmap_lock once on alloc")
> landed v2 of the patch instead of v3. This patch remedies the problem.
>
> No functional changes.
>
> Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>

The patch looks obviously good, but it conflicts with Pavel's

[PATCH v4 0/4] pid_namespace: make init creation more flexible
https://lore.kernel.org/all/20260225133229.550302-1-ptikhomirov@xxxxxxxxxxxxx/#t

In particular, this change:

- for (tmp = ns, i = ns->level; i >= 0; i--) {
+ for (tmp = ns, i = ns->level; i >= 0;) {

With 2/4 from Pavel we need to decrement "i" before the ->child_reaper
check.

Pavel, it seems you need to resend your series anyway, may be on top
of this patch, I dunno. And probably this all should be routed via
-mm tree?

Oleg.

> ---
> kernel/pid.c | 19 +++++++++----------
> 1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 2f1dbcbc2349..dbe82062e683 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -177,7 +177,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
> * for a process in all nested PID namespaces but arg_set_tid_size must
> * never be greater than the current ns->level + 1.
> */
> - if (arg_set_tid_size > ns->level + 1)
> + if (unlikely(arg_set_tid_size > ns->level + 1))
> return ERR_PTR(-EINVAL);
>
> /*
> @@ -186,7 +186,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
> * 1. allocate and fill in pid struct
> */
> pid = kmem_cache_alloc(ns->pid_cachep, GFP_KERNEL);
> - if (!pid)
> + if (unlikely(!pid))
> return ERR_PTR(retval);
>
> get_pid_ns(ns);
> @@ -205,7 +205,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
> * This stores found pid_max to make sure the used value is the same should
> * later code need it.
> */
> - for (tmp = ns, i = ns->level; i >= 0; i--) {
> + for (tmp = ns, i = ns->level; i >= 0;) {
> pid_max[ns->level - i] = READ_ONCE(tmp->pid_max);
>
> if (arg_set_tid_size) {
> @@ -227,6 +227,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
> }
>
> tmp = tmp->parent;
> + i--;
> }
>
> /*
> @@ -247,10 +248,9 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
> tid + 1, GFP_ATOMIC);
> /*
> * If ENOSPC is returned it means that the PID is
> - * alreay in use. Return EEXIST in that case.
> + * already in use. Return EEXIST in that case.
> */
> if (nr == -ENOSPC)
> -
> nr = -EEXIST;
> } else {
> int pid_min = 1;
> @@ -276,12 +276,11 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
> * Preload more memory if idr_alloc{,cyclic} failed with -ENOMEM.
> *
> * The IDR API only allows us to preload memory for one call, while we may end
> - * up doing several under pidmap_lock with GFP_ATOMIC. The situation may be
> - * salvageable with GFP_KERNEL. But make sure to not loop indefinitely if preload
> - * did not help (the routine unfortunately returns void, so we have no idea
> - * if it got anywhere).
> + * up doing several with GFP_ATOMIC. It may be the situation is salvageable with
> + * GFP_KERNEL. But make sure to not loop indefinitely if preload did not help
> + * (the routine unfortunately returns void, so we have no idea if it got anywhere).
> *
> - * The lock can be safely dropped and picked up as historically pid allocation
> + * The pidmap lock can be safely dropped and picked up as historically pid allocation
> * for different namespaces was *not* atomic -- we try to hold on to it the
> * entire time only for performance reasons.
> */
> --
> 2.48.1
>