Re: [PATCH] fanotify: report thread pidfds for FAN_REPORT_TID

From: Christian Brauner

Date: Thu May 28 2026 - 07:56:14 EST


On 2026-05-24 18:24 +0800, AnonymeMeow wrote:
> The FAN_REPORT_PIDFD and FAN_REPORT_TID flags used to be mutually
> exclusive because by the time the pidfd support was introduced to
> fanotify, pidfds could only be created for thread group leaders. Now
> that the pidfd API supports thread-specific pidfds via PIDFD_THREAD,
> this restriction can be lifted.
>
> This patch allows listeners using FAN_REPORT_PIDFD | FAN_REPORT_TID
> to receive the pidfd referring to the thread that triggered the
> event.
>
> Signed-off-by: AnonymeMeow <anonymemeow@xxxxxxxxx>
> ---
> fs/notify/fanotify/fanotify_user.c | 27 ++++++++-------------------
> 1 file changed, 8 insertions(+), 19 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index ae904451dfc0..ebdd48942029 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -19,6 +19,7 @@
> #include <linux/memcontrol.h>
> #include <linux/statfs.h>
> #include <linux/exportfs.h>
> +#include <linux/pidfd.h>
>
> #include <asm/ioctls.h>
>
> @@ -903,25 +904,21 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> metadata.fd = fd >= 0 ? fd : FAN_NOFD;
>
> if (pidfd_mode) {
> - /*
> - * Complain if the FAN_REPORT_PIDFD and FAN_REPORT_TID mutual
> - * exclusion is ever lifted. At the time of incoporating pidfd
> - * support within fanotify, the pidfd API only supported the
> - * creation of pidfds for thread-group leaders.
> - */
> - WARN_ON_ONCE(FAN_GROUP_FLAG(group, FAN_REPORT_TID));
> + unsigned int tid_mode = FAN_GROUP_FLAG(group, FAN_REPORT_TID);
> + enum pid_type pidtype = tid_mode ? PIDTYPE_PID : PIDTYPE_TGID;
> + unsigned int pidfd_flags = tid_mode ? PIDFD_THREAD : 0;
>
> /*
> - * The PIDTYPE_TGID check for an event->pid is performed
> + * The pid_has_task() check for an event->pid is performed
> * preemptively in an attempt to catch out cases where the event
> - * listener reads events after the event generating process has
> + * listener reads events after the event generating task has
> * already terminated. Depending on flag FAN_REPORT_FD_ERROR,
> * report either -ESRCH or FAN_NOPIDFD to the event listener in
> * those cases with all other pidfd creation errors reported as
> * the error code itself or as FAN_EPIDFD.
> */
> - if (metadata.pid && pid_has_task(event->pid, PIDTYPE_TGID))
> - pidfd = pidfd_prepare(event->pid, 0, &pidfd_file);
> + if (metadata.pid && pid_has_task(event->pid, pidtype))

For quite a while the kernel refused to hand out pidfds for reaped
processes even if the struct pid was pinned like in this case.

But that makes various APIs - including this one - way less powerful
than they can be. Nowadays the socket layer already hands out pidfds for
reaped processes. It also stashes the struct pid. Let's do the same
here.

Drop the pid_has_task() change and then:

pidfd = pidfd_prepare(event->pid, pidfd_flags | PIDFD_STALE, &pidfd_file);

which instructs pidfs to hand out a pidfd even if the task has already
been reaped. Reaped pidfds can still be queried for various types of
information that is kept around even if the task is long gone.
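
For context on the listener side, here is a minimal userspace sketch of
how the reported pidfd could be pulled out of an event. It assumes the
group was created with FAN_REPORT_PIDFD, that the installed headers
provide struct fanotify_event_info_pidfd, and that md points at one
complete event in the read buffer. The helper name find_pidfd_record()
is made up for illustration and is not part of any API.

```c
#include <stddef.h>
#include <string.h>
#include <sys/fanotify.h>

/*
 * Hypothetical helper: scan the info records that follow one event's
 * metadata and return the value of the pidfd record. Returns
 * FAN_NOPIDFD if the event carries no pidfd record.
 */
static int find_pidfd_record(const struct fanotify_event_metadata *md)
{
	const char *base = (const char *)md;
	size_t off = md->metadata_len;

	while (off + sizeof(struct fanotify_event_info_header) <= md->event_len) {
		struct fanotify_event_info_header hdr;

		/* Copy out the header to avoid unaligned access. */
		memcpy(&hdr, base + off, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || off + hdr.len > md->event_len)
			break;	/* malformed record, stop scanning */

		if (hdr.info_type == FAN_EVENT_INFO_TYPE_PIDFD &&
		    hdr.len >= sizeof(struct fanotify_event_info_pidfd)) {
			struct fanotify_event_info_pidfd info;

			memcpy(&info, base + off, sizeof(info));
			return info.pidfd;
		}
		off += hdr.len;
	}
	return FAN_NOPIDFD;
}
```

A caller would treat any negative return as "no usable pidfd": that
covers FAN_NOPIDFD and FAN_EPIDFD, or a negative error code when the
group uses FAN_REPORT_FD_ERROR.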


> + pidfd = pidfd_prepare(event->pid, pidfd_flags, &pidfd_file);
>
> if (!FAN_GROUP_FLAG(group, FAN_REPORT_FD_ERROR) && pidfd < 0)
> pidfd = pidfd == -ESRCH ? FAN_NOPIDFD : FAN_EPIDFD;
> @@ -1628,14 +1625,6 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> #endif
> return -EINVAL;
>
> - /*
> - * A pidfd can only be returned for a thread-group leader; thus
> - * FAN_REPORT_PIDFD and FAN_REPORT_TID need to remain mutually
> - * exclusive.
> - */
> - if ((flags & FAN_REPORT_PIDFD) && (flags & FAN_REPORT_TID))
> - return -EINVAL;
> -
> /* Don't allow mixing mnt events with inode events for now */
> if (flags & FAN_REPORT_MNT) {
> if (class != FAN_CLASS_NOTIF)
> --
> 2.54.0
>
>