Re: [PATCH RFC v3 2/4] pidfd: add CLONE_PIDFD_AUTOKILL

From: Jann Horn

Date: Tue Feb 17 2026 - 18:44:54 EST


On Tue, Feb 17, 2026 at 11:36 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> Add a new clone3() flag CLONE_PIDFD_AUTOKILL that ties a child's
> lifetime to the pidfd returned from clone3(). When the last reference to
> the struct file created by clone3() is closed the kernel sends SIGKILL
> to the child. A pidfd obtained via pidfd_open() for the same process
> does not keep the child alive and does not trigger autokill - only the
> specific struct file from clone3() has this property.
>
> This is useful for container runtimes, service managers, and sandboxed
> subprocess execution - any scenario where the child must die if the
> parent crashes or abandons the pidfd.

Idle thought, feel free to ignore:
In those scenarios, I guess what you'd ideally want is a way to kill
the entire process hierarchy, not just the one process that was
spawned? (Unless that process is PID 1 of its own pid namespace anyway,
in which case killing it already takes down everything in that
namespace.) But that would probably be more invasive and is kind of an
orthogonal feature...

[...]
> +static int pidfs_file_release(struct inode *inode, struct file *file)
> +{
> +	struct pid *pid = inode->i_private;
> +	struct task_struct *task;
> +
> +	guard(rcu)();
> +	task = pid_task(pid, PIDTYPE_TGID);
> +	if (task && READ_ONCE(task->signal->autokill_pidfd) == file)

Can you maybe also clear out the task->signal->autokill_pidfd pointer
here? It should be fine in practice either way, but theoretically,
with the current code, this equality check could wrongly match if the
actual autokill file has already been released and a new pidfd file
has since been allocated at the same address... Of course, at worst
that would kill a task that has already been killed, so it wouldn't be
particularly bad, but it's still ugly.

> +		do_send_sig_info(SIGKILL, SEND_SIG_PRIV, task, PIDTYPE_TGID);
> +
> +	return 0;
> +}
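
To make the stale-pointer scenario in pidfs_file_release() concrete,
here is a rough userspace model, not kernel code and not a proposed
patch. struct fake_file, struct fake_signal, fake_release() and the
kills counter are all made up for illustration, and C11 atomics stand
in for whatever READ_ONCE()/WRITE_ONCE() or locking the real code would
use. The point is only that clearing the pointer on the first match
keeps a later file at the same address from matching again:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file and struct signal_struct. */
struct fake_file { int unused; };

struct fake_signal {
	_Atomic(struct fake_file *) autokill_pidfd;
	int kills;	/* counts simulated SIGKILLs */
};

/*
 * Model of a ->release() that clears the pointer when it matches.
 * Returns 1 if a (simulated) SIGKILL was sent, 0 otherwise.
 */
static int fake_release(struct fake_signal *sig, struct fake_file *file)
{
	struct fake_file *expected = file;

	/* Compare and clear in one step, so a match can happen only once. */
	if (!atomic_compare_exchange_strong(&sig->autokill_pidfd, &expected,
					    (struct fake_file *)NULL))
		return 0;

	sig->kills++;
	return 1;
}

/*
 * Scenario: the autokill file is released, then an unrelated pidfd file
 * that happens to live at the same address is released too.
 * Returns the number of simulated kills.
 */
static int simulate_address_reuse(void)
{
	struct fake_file slot;	/* one address, "reused" by both files */
	struct fake_signal sig = { .kills = 0 };

	atomic_init(&sig.autokill_pidfd, &slot);
	fake_release(&sig, &slot);	/* the real autokill file goes away */
	fake_release(&sig, &slot);	/* a different file, same address */
	return sig.kills;
}
```

Without the clearing step, the second fake_release() call would match
as well and send a second (harmless but spurious) kill.
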
[...]
> @@ -2470,8 +2479,11 @@ __latent_entropy struct task_struct *copy_process(
>  	syscall_tracepoint_update(p);
>  	write_unlock_irq(&tasklist_lock);
>
> -	if (pidfile)
> +	if (pidfile) {
> +		if (clone_flags & CLONE_PIDFD_AUTOKILL)
> +			p->signal->autokill_pidfd = pidfile;

WRITE_ONCE() to match the READ_ONCE() in pidfs_file_release()?

>  		fd_install(pidfd, pidfile);
> +	}
>
>  	proc_fork_connector(p);
>  	sched_post_fork(p);
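
For the WRITE_ONCE() comment on the copy_process() hunk above, a small
userspace illustration of the pairing I have in mind. The macros below
are simplified approximations of the kernel ones (the real definitions
do additional type checking), and struct fake_file, struct fake_signal,
fake_set_autokill() and fake_is_autokill() are hypothetical names used
only for this sketch:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace approximations of the kernel macros. */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-ins for struct file and struct signal_struct. */
struct fake_file { int unused; };
struct fake_signal { struct fake_file *autokill_pidfd; };

/* Writer side, modelled on the assignment in copy_process(). */
static void fake_set_autokill(struct fake_signal *sig, struct fake_file *file)
{
	WRITE_ONCE(sig->autokill_pidfd, file);
}

/* Reader side, modelled on the check in pidfs_file_release(). */
static int fake_is_autokill(struct fake_signal *sig, struct fake_file *file)
{
	return READ_ONCE(sig->autokill_pidfd) == file;
}
```

The volatile accesses mainly document that the field is read and
written locklessly and keep the compiler from tearing or refetching the
access; they do not add any ordering by themselves.
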