Re: [PATCH v2] fanotify: report thread pidfds for FAN_REPORT_TID

From: Amir Goldstein

Date: Fri May 29 2026 - 03:28:33 EST


On Fri, May 29, 2026 at 4:01 AM AnonymeMeow <anonymemeow@xxxxxxxxx> wrote:
>
> The FAN_REPORT_PIDFD and FAN_REPORT_TID flags used to be mutually
> exclusive because by the time the pidfd support was introduced to
> fanotify, pidfds could only be created for thread group leaders. Now
> that the pidfd API supports thread-specific pidfds via PIDFD_THREAD,
> this restriction can be lifted.
>
> Also drop the pid_has_task() check to allow pidfds to be reported for
> reaped tasks as well.

I am not objecting to this change, but as a rule of thumb, almost
every time that
a commit message says "Also" it means that it should have been a separate
commit.

This is not an exception to the rule and even worse -
This change PIDFD_STALE is an independent change of behavior
(as observed by breakage of fanotify21) that could (maybe unlikely)
surprise real users.

We have few options:
1. Make it opt-in with yet another fanotify_init flag
2. Enable it implicitly together (and in the same patch) with FAN_REPORT_TID
3. Enable it without opt-in (as you did) but as separate patch
that will be reverted if someone complains on "real world"
(i.e. not test) breakage

IMO, the path of least resistance is option 2.
If users want to get the pidfd of tgid I suppose there should be no
problem to get it from the pidfd of the thread. Right?
So what FAN_REPORT_TID really means is give me a richer pidfd
with more information about the actor than without FAN_REPORT_TID.

I'd like to get Jan's input on those UAPI choices.

>
> Link: https://lore.kernel.org/lkml/20260528-schmuckvoll-heilen-garen-be77b4208671@brauner/
> Signed-off-by: AnonymeMeow <anonymemeow@xxxxxxxxx>
> ---
>
> On 2026-05-28 13:51 +0200, Christian Brauner wrote:
> > For quite a while the kernel refused to hand out pidfds for reaped
> > processes even if the struct pid was pinned like in this case.
> >
> > But that makes various APIs - including this one - way less powerful
> > than they can be. Nowadays the socket layer already hands out pidfds for
> > reaped processes. It also stashed the struct pid. Let's do the same
> > here.
> >
> > Drop the pid_has_task() change and then:
> >
> > pidfd = pidfd_prepare(event->pid, pidfd_flags | PIDFD_STALE, &pidfd_file);
> >
> > which instructs pidfs to and out a pidfd even if the task has already
> > been reaped. Reaped pidfds can still be queried for various types of
> > information that is kept around even if the task is long gone.
>
> Thanks for the review. I've updated this in v2 as you suggested.
>
> Also, the LTP fanotify21 test currently explicitly expects ESRCH or
> FAN_NOPIDFD for exited processes. I will update that test case accordingly.

This sentence highlights the problematic aspect of this change.
How would you update the test expectation?
The test must be able to run on upstream as well as stable kernels.
Would you test for support in FAN_REPORT_TID
and according to the result determine the expected behavior of the
test cases (variants) without FAN_REPORT_TID?
Very ugly IMO.

OTH, if you take the suggestion that PIDFD_STALE is implied only
for FAN_REPORT_TID, you can change the expectation for the
"terminated child" test case only for TST_VARIANT_PIDFD_THREAD.

Thanks,
Amir.

>
> With Best Regards,
> AnonymeMeow
>
> ---
> fs/notify/fanotify/fanotify_user.c | 33 +++++++-----------------------
> 1 file changed, 7 insertions(+), 26 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index ae904451dfc0..b604e3da58ad 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -19,6 +19,7 @@
> #include <linux/memcontrol.h>
> #include <linux/statfs.h>
> #include <linux/exportfs.h>
> +#include <linux/pidfd.h>
>
> #include <asm/ioctls.h>
>
> @@ -903,25 +904,13 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> metadata.fd = fd >= 0 ? fd : FAN_NOFD;
>
> if (pidfd_mode) {
> - /*
> - * Complain if the FAN_REPORT_PIDFD and FAN_REPORT_TID mutual
> - * exclusion is ever lifted. At the time of incoporating pidfd
> - * support within fanotify, the pidfd API only supported the
> - * creation of pidfds for thread-group leaders.
> - */
> - WARN_ON_ONCE(FAN_GROUP_FLAG(group, FAN_REPORT_TID));
> + unsigned int pidfd_flags = PIDFD_STALE;
>
> - /*
> - * The PIDTYPE_TGID check for an event->pid is performed
> - * preemptively in an attempt to catch out cases where the event
> - * listener reads events after the event generating process has
> - * already terminated. Depending on flag FAN_REPORT_FD_ERROR,
> - * report either -ESRCH or FAN_NOPIDFD to the event listener in
> - * those cases with all other pidfd creation errors reported as
> - * the error code itself or as FAN_EPIDFD.
> - */
> - if (metadata.pid && pid_has_task(event->pid, PIDTYPE_TGID))
> - pidfd = pidfd_prepare(event->pid, 0, &pidfd_file);
> + if (FAN_GROUP_FLAG(group, FAN_REPORT_TID))
> + pidfd_flags |= PIDFD_THREAD;
> +
> + if (metadata.pid)
> + pidfd = pidfd_prepare(event->pid, pidfd_flags, &pidfd_file);
>
> if (!FAN_GROUP_FLAG(group, FAN_REPORT_FD_ERROR) && pidfd < 0)
> pidfd = pidfd == -ESRCH ? FAN_NOPIDFD : FAN_EPIDFD;
> @@ -1628,14 +1617,6 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> #endif
> return -EINVAL;
>
> - /*
> - * A pidfd can only be returned for a thread-group leader; thus
> - * FAN_REPORT_PIDFD and FAN_REPORT_TID need to remain mutually
> - * exclusive.
> - */
> - if ((flags & FAN_REPORT_PIDFD) && (flags & FAN_REPORT_TID))
> - return -EINVAL;
> -
> /* Don't allow mixing mnt events with inode events for now */
> if (flags & FAN_REPORT_MNT) {
> if (class != FAN_CLASS_NOTIF)
> --
> 2.54.0
>