Re: [PATCH 2/4] pid: add pidfd_open()

From: Jann Horn
Date: Wed Mar 27 2019 - 13:08:24 EST


On Wed, Mar 27, 2019 at 5:22 PM Christian Brauner <christian@xxxxxxxxxx> wrote:
> pidfd_open() allows to retrieve pidfds for processes and removes the
> dependency of pidfd on procfs. Multiple people have expressed a desire to
> do this even when pidfd_send_signal() was merged. It is even recorded in
[...]
> IF PROCFD_TO_PIDFD is passed as a flag together with a file descriptor to a
> /proc mount in a given pid namespace and a pidfd pidfd_open() will return a
> file descriptor to the corresponding /proc/<pid> directory in procfs
> mounts' pid namespace. pidfd_open() is very careful to verify that the pid

nit: s/mounts'/mount's/

> hasn't been recycled in between.
> IF PIDFD_TO_PROCFD is passed as a flag together with a file descriptor
> referencing a /proc/<pid> directory a pidfd referencing the struct pid
> stashed in /proc/<pid> of the process will be returned.

nit: s/of the process //?

> The pidfd_open() syscalls in that manner resembles openat() as it uses a

nit: s/syscalls/syscall/

[...]
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 20881598bdfa..c9e24e726aba 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
[...]
> +static struct file *pidfd_open_proc_pid(const struct file *procf, pid_t pid,
> + const struct pid *pidfd_pid)
> +{
> + char name[11]; /* int to strlen + \0 */

nit: The comment is a bit off; an unconstrained int needs 1+10+1
bytes, I think? minus sign, 10 digits, nullbyte? But of course that
can't actually happen here.

> + struct file *file;
> + struct pid *proc_pid;
> +
> + snprintf(name, sizeof(name), "%d", pid);
> + file = file_open_root(procf->f_path.dentry, procf->f_path.mnt, name,
> + O_DIRECTORY | O_NOFOLLOW, 0);

Maybe explicitly write the implied O_RDONLY (which is 0) here for clarity?

[...]
> +static int pidfd_to_procfd(pid_t pid, int procfd, int pidfd)
> +{
> + long fd;
> + pid_t ns_pid;
> + struct fd fdproc, fdpid;
> + struct file *file = NULL;
> + struct pid *pidfd_pid = NULL;
> + struct pid_namespace *proc_pid_ns = NULL;
> +
> + fdproc = fdget(procfd);
> + if (!fdproc.file)
> + return -EBADF;
> +
> + fdpid = fdget(pidfd);
> + if (!fdpid.file) {
> + fdput(fdpid);

Typo: s/fdput(fdpid)/fdput(fdproc)/

[...]
> +SYSCALL_DEFINE4(pidfd_open, pid_t, pid, int, procfd, int, pidfd, unsigned int,
> + flags)
[...]
> + if (!flags) {
[...]
> + rcu_read_lock();
> + pidfd_pid = get_pid(find_pid_ns(pid, task_active_pid_ns(current)));
> + rcu_read_unlock();

The previous three lines are equivalent to `pidfd_pid = find_get_pid(pid)`.

> + fd = pidfd_create_fd(pidfd_pid, O_CLOEXEC);

Nit: You could hardcode O_CLOEXEC in pidfd_create_fd() and get rid of
the second function argument if you want to.

> + put_pid(pidfd_pid);
> + } else if (flags & PIDFD_TO_PROCFD) {
[...]
> + fd = pidfd_to_procfd(pid, procfd, pidfd);

The `pid` argument of pidfd_to_procfd() looks unused, maybe it makes
sense to get rid of that?