[PATCH RFC 0/2] pidfd: add CLONE_AUTOREAP
From: Christian Brauner
Date: Mon Feb 16 2026 - 08:49:20 EST
Add a new clone3() flag CLONE_AUTOREAP that makes a child process
auto-reap on exit without ever becoming a zombie. This is a per-process
property in contrast to the existing auto-reap mechanism via
SA_NOCLDWAIT or SIG_IGN for SIGCHLD which applies to all children of a
given parent.
With pidfds this is very useful as the parent can monitor the pidfd via
poll and retrieve the exit status from the pidfd.
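
A minimal sketch of the monitoring pattern, written so that it runs on
kernels without this series. Assumptions: the child is a plain fork() child,
pidfd_open() stands in for the pidfd that clone3() would return, and
waitid(P_PIDFD) stands in for reading the exit status from the pidfd.
wait_child_via_pidfd() is a name made up for this example, and the fallback
syscall number is only valid on architectures that use the unified syscall
table.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434	/* assumption: unified syscall number */
#endif
#ifndef P_PIDFD
#define P_PIDFD 3		/* fallback for older libc headers */
#endif

/*
 * Fork a child that exits with @code, wait for it by polling a pidfd,
 * and return its exit status. Returns a negative errno on failure.
 */
int wait_child_via_pidfd(int code)
{
	struct pollfd pfd;
	siginfo_t info;
	pid_t pid;
	int pidfd, ret;

	pid = fork();
	if (pid < 0)
		return -errno;
	if (pid == 0)
		_exit(code);

	pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (pidfd < 0) {
		ret = -errno;
		waitpid(pid, NULL, 0);	/* do not leave a zombie behind */
		return ret;
	}

	/* A pidfd becomes readable once the process has exited. */
	pfd.fd = pidfd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	do {
		ret = poll(&pfd, 1, -1);
	} while (ret < 0 && errno == EINTR);
	if (ret < 0) {
		ret = -errno;
		waitpid(pid, NULL, 0);
		close(pidfd);
		return ret;
	}

	/* Without autoreap the child is a zombie here and must be reaped. */
	memset(&info, 0, sizeof(info));
	if (waitid(P_PIDFD, pidfd, &info, WEXITED) < 0)
		ret = -errno;
	else if (info.si_code == CLD_EXITED)
		ret = info.si_status;
	else
		ret = -ECHILD;	/* killed by a signal; not expected here */

	close(pidfd);
	return ret;
}
```

With an autoreaped child the waitid() step would not apply, since the
child is never visible to wait(); the status would have to come from the
pidfd itself.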
Currently the only way to automatically reap children is to set
SA_NOCLDWAIT or SIG_IGN on SIGCHLD. This is a parent-scoped property
affecting all children which makes it unsuitable for libraries or
applications that need selective auto-reaping of specific children while
still being able to wait() on others.
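
For comparison, the existing parent-scoped mechanism can be sketched as
follows. demo_nocldwait() is a name made up for this example. It sets
SA_NOCLDWAIT on SIGCHLD, forks one child, and checks that waitpid() on that
child fails with ECHILD, which is what happens for every child of the
process while the flag is set.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 0 if the child was auto-reaped (waitpid() failed with ECHILD),
 * -1 otherwise. The previous SIGCHLD disposition is restored on return.
 */
int demo_nocldwait(void)
{
	struct sigaction sa, old;
	pid_t pid;
	int ret, saved_errno;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = SIG_DFL;
	sa.sa_flags = SA_NOCLDWAIT;	/* applies to all children */
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, &old) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		sigaction(SIGCHLD, &old, NULL);
		return -1;
	}
	if (pid == 0)
		_exit(42);

	/* Blocks until the child is gone, then fails: nothing to reap. */
	ret = waitpid(pid, NULL, 0);
	saved_errno = errno;

	sigaction(SIGCHLD, &old, NULL);
	return (ret == -1 && saved_errno == ECHILD) ? 0 : -1;
}
```

The exit status (42 here) is lost to the parent, and no other child of
this process can be waited on either while the disposition is in place.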
CLONE_AUTOREAP stores an autoreap flag in the child's signal_struct.
When the child exits, do_notify_parent() checks this flag and returns
autoreap=true, causing exit_notify() to transition the task directly to
EXIT_DEAD. Since the flag lives on the child, it survives reparenting: if
the original parent exits and the child is reparented to a subreaper or
init, the child still auto-reaps when it eventually exits. This is
cleaner than forcing the subreaper to get SIGCHLD and then reap the child.
If the parent doesn't care, the subreaper won't care. If there's a
subreaper that would care, it would be easy enough to add either a prctl()
that just turns SIGCHLD back on and turns off auto-reaping, or a prctl()
that just notifies the subreaper whenever a child is reparented to it.
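
The exit-path decision described above can be modelled in user space
roughly as follows. This is a toy model, not the kernel code: every name
with a sketch_ prefix is invented for illustration, and the real
do_notify_parent() does considerably more. It only shows why a flag
stored on the child is unaffected by a change of parent.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_exit_state { SKETCH_EXIT_ZOMBIE, SKETCH_EXIT_DEAD };

struct sketch_task {
	struct sketch_task *parent;
	bool autoreap;		/* per-child: set at creation time */
	bool ignores_sigchld;	/* as a parent: SIG_IGN or SA_NOCLDWAIT */
};

/* Returns true if the exiting child should be reaped automatically. */
static bool sketch_do_notify_parent(const struct sketch_task *child)
{
	if (child->autoreap)
		return true;
	return child->parent && child->parent->ignores_sigchld;
}

/* Picks the state the exiting task moves to. */
enum sketch_exit_state sketch_exit_notify(const struct sketch_task *child)
{
	return sketch_do_notify_parent(child) ? SKETCH_EXIT_DEAD
					      : SKETCH_EXIT_ZOMBIE;
}

/* Reparenting changes the parent pointer and nothing else. */
void sketch_reparent(struct sketch_task *child, struct sketch_task *reaper)
{
	child->parent = reaper;
}
```

In this model a child created with autoreap set reaches
SKETCH_EXIT_DEAD whether its parent is the original one or a reaper it
was handed to, while a sibling without the flag still becomes a zombie.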
CLONE_AUTOREAP requires CLONE_PIDFD because the process will never be
visible to wait(). The parent must use the pidfd to monitor exit via
poll() and retrieve exit status via PIDFD_GET_INFO. No exit signal is
delivered so exit_signal must be zero.
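
The two constraints above amount to a small argument check. The sketch
below is an illustration only: SKETCH_CLONE_AUTOREAP is a placeholder bit
and not the value from the uapi header, sketch_check_autoreap() is a
made-up name, and returning -EINVAL is an assumption about how a
violation would be reported.

```c
#include <errno.h>
#include <stdint.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000ULL
#endif

/* Placeholder bit for illustration; NOT the real uapi value. */
#define SKETCH_CLONE_AUTOREAP (1ULL << 40)

/*
 * Validate clone3()-style arguments against the rules stated above:
 * autoreap requires a pidfd and forbids an exit signal.
 */
int sketch_check_autoreap(uint64_t flags, uint64_t exit_signal)
{
	if (!(flags & SKETCH_CLONE_AUTOREAP))
		return 0;		/* nothing to check */
	if (!(flags & CLONE_PIDFD))
		return -EINVAL;		/* no other way to observe the exit */
	if (exit_signal != 0)
		return -EINVAL;		/* no exit signal is delivered */
	return 0;
}
```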
The flag is not inherited by the autoreap process's own children. Each
child that should be autoreaped must be explicitly created with
CLONE_AUTOREAP.
(Later on we can augment this with another addition, CLONE_PIDFD_AUTOKILL,
which would SIGKILL the child process when the pidfd that was returned
from clone3() is closed; specifically, when the file referenced by the
fd from clone3() is closed. The wrinkle here is that it would either
have to be reset on privilege-gaining exec, like the pdeath signal, or we
enforce that autokill only works when no-new-privileges is set.)
Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
---
Christian Brauner (2):
      clone: add CLONE_AUTOREAP
      selftests/pidfd: add CLONE_AUTOREAP tests

 include/linux/sched/signal.h                      |   1 +
 include/uapi/linux/sched.h                        |   1 +
 kernel/fork.c                                     |  16 +-
 kernel/ptrace.c                                   |   3 +-
 kernel/signal.c                                   |   4 +
 tools/testing/selftests/pidfd/.gitignore          |   1 +
 tools/testing/selftests/pidfd/Makefile            |   2 +-
 .../testing/selftests/pidfd/pidfd_autoreap_test.c | 475 +++++++++++++++++++++
 8 files changed, 500 insertions(+), 3 deletions(-)
---
base-commit: 72c395024dac5e215136cbff793455f065603b06
change-id: 20260214-work-pidfs-autoreap-3ee677e240a8