Re: [PATCH v5] pidfd: add ioctl to retrieve pid info

From: Oleg Nesterov
Date: Sun Oct 06 2024 - 13:22:28 EST


On 10/06, luca.boccassi@xxxxxxxxx wrote:
>
> +static long pidfd_info(struct task_struct *task, unsigned int cmd, unsigned long arg)
> +{
> + struct pidfd_info __user *uinfo = (struct pidfd_info __user *)arg;
> + size_t usize = _IOC_SIZE(cmd);
> + struct pidfd_info kinfo = {};
> + struct user_namespace *user_ns;
> + const struct cred *c;
> + __u64 request_mask;
> +
> + if (!uinfo)
> + return -EINVAL;
> + if (usize < sizeof(struct pidfd_info))
> + return -EINVAL; /* First version, no smaller struct possible */
> +
> + if (copy_from_user(&request_mask, &uinfo->request_mask, sizeof(request_mask)))
> + return -EFAULT;
> +
> + c = get_task_cred(task);
> + if (!c)
> + return -ESRCH;
> +
> + /* Unconditionally return identifiers and credentials, the rest only on request */
> +
> + kinfo.pid = task_pid_vnr(task);
> + kinfo.tgid = task_tgid_vnr(task);
> + kinfo.ppid = task_ppid_nr_ns(task, task_active_pid_ns(task));
^^^^^^^^^^^^^^^^^^^^^^^^
The same problem as with "info.pid = pid_nr_ns(pid, task_active_pid_ns(task));"
you used before. You should use the caller's namespace, not the task's namespace.

kinfo.ppid = task_ppid_nr_ns(task, NULL);

should work, see __task_pid_nr_ns() which uses task_active_pid_ns(current) if
ns == NULL.

> + /*
> + * One last check before returning: the task might have exited while we
> + * were fetching all the data, so check again to be safe. */
> + if (task_pid_vnr(task) == 0)
> + return -ESRCH;

Well, this looks strange. It would be better to kill "kinfo.pid = task_pid_vnr()"
above and do

kinfo.pid = task_pid_vnr(task);
if (!kinfo.pid)
return -ESRCH;

at the end, but this is minor.

I don't think we can rely on this check.

Suppose that pidfd_info() runs on CPU_0 and it races with __unhash_process(task)
on CPU_1 which does

detach_pid(p, PIDTYPE_PID);
detach_pid(p, PIDTYPE_TGID);

Without the barries/locking CPU_0 can see the changes above out of order,
so it is possible that pidfd_info() will see task_pid_vnr(task) != 0, but
report kinfo.tgid == 0.

Oleg.