Re: [PATCH v4 2/2] fanotify: allow reporting pidfds for reaped tasks

From: Christian Brauner

Date: Wed Jun 03 2026 - 07:28:55 EST

On Wed, Jun 03, 2026 at 12:58:09PM +0200, Amir Goldstein wrote:
> On Wed, Jun 3, 2026 at 12:28 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> >
> > On Wed, Jun 03, 2026 at 12:00:42PM +0200, Amir Goldstein wrote:
> > > On Wed, Jun 3, 2026 at 5:11 AM AnonymeMeow <anonymemeow@xxxxxxxxx> wrote:
> > > >
> > > > Sashiko said that pidfs_register_pid() can fail under memory pressure,
> > > > since it allocates memory using GFP_KERNEL without a way to pass in
> > > > __GFP_NOFAIL. This can cause fanotify_alloc_event() to silently drop
> > > > the event even when the group has an unlimited queue length.
> > > >
> > > > Should we make pidfs registration best-effort instead and still return
> > > > the allocated event if pidfs_register_pid() fails? But IMO this will
> > > > make the API semantically ambiguous. If the kernel guarantees that the
> > > > event pid is registered with pidfs when pidfd reporting is requested,
> > > > then a later FAN_NOPIDFD or -ESRCH can only be caused by the task being
> > > > invisible within the reader's pid namespace. But if we allow best-effort
> > > > registration, the event may still carry a pinned struct pid, but pidfs
> > > > registration may have failed due to memory pressure. If userspace reads
> > > > the event after the task has been reaped, it still gets a FAN_NOPIDFD
> > > > or -ESRCH due to failed pidfs registration caused by memory pressure,
> > > > IMO this should be reported as a -ENOMEM instead of -ESRCH. But
> > > > currently it seems we don't have a good way to register the pid with
> > > > __GFP_NOFAIL or __GFP_RETRY_MAYFAIL, so what would be the preferred
> > > > approach to handle this situation?
> > >
> > > Thanks for noticing the oddity and explaining the options.
> > >
> > > We could fail the event in case on ENOMEM because the system
> > > is likely is a bad shape anyway and other events are likely to fail
> > > allocation themselves, so I am not objecting to your patch as is,
> > >
> > > but I am slightly leaning towards the semantically ambiguous API.
> >
> > As usual, I think a semantically ambiguous api should be avoided.
> >
> > > I don't think that we are going to change the man page to say that
> > > FAN_REPORT_PIDFD is guaranteed to return a pidfd, maybe we just
> > > add a NOTE about increased likelihood of getting a pidfd since kernel XXX
> > > and mention the known cases of pid namespace and ENOMEM.
> >
> > The pid namespace shouldn't matter.
>
> DO you mean that the event reader inside a pidns always gets a pidfd but
> it will not be able to query the pid of the process outside the pidns
> or do you mean something else?

So they will always get a pidfd since fanotify is correctly using
pidfd_prepare() directly. But when the reader tries to resolve the pidfd
to say a pid or retrieve info via the pidfd info ioctl they would fail
to do so (well, the pid will resolve to 0 which is the global indicator
for "can't be resolved in your pidns").