Re: [PATCH] inotify: disallow watches on unsupported filesystems

From: Amir Goldstein
Date: Tue Mar 04 2025 - 15:04:29 EST


On Tue, Mar 4, 2025 at 8:06 PM Seyediman Seyedarab <imandevel@xxxxxxxxx> wrote:
>
> On 25/03/04 05:41PM, Amir Goldstein wrote:
> > On Tue, Mar 4, 2025 at 5:05 PM Seyediman Seyedarab <imandevel@xxxxxxxxx> wrote:
> > >
> > > On 25/03/04 12:57PM, Amir Goldstein wrote:
> > > > On Tue, Mar 4, 2025 at 8:59 AM Seyediman Seyedarab <imandevel@xxxxxxxxx> wrote:
>
> > > I understand why it might seem like disallowing users from monitoring
> > > these filesystems could break userspace in some way. BUT, programs
> > > work incorrectly precisely because they do not receive any information
> > > from the kernel, so in other words they are already broken. There is no
> > > way for them to know if the fs is supported or not. I mean, even we are
> > > not sure at the moment, then how would they know.
> >
> > Programs not knowing is a problem that could be fixed with a new API
> > or new init flag to fanotify/inotify.
> >
> > Existing programs that would break due to this change is unacceptable.
> >
>
> Maybe something like IN_DISALLOW_REMOTE could work for now, at least
> until remote change notifications are properly implemented for those
> specific filesystems? Later, if needed, it could evolve into a new API,
> and the flag could become the default behavior when passed to that API.
>
> > > As an example, 'Waybar' is a userspace program affected by this patch.
> > > Since it relies on monitoring sysfs, it isn't working properly anyway.
> > > This is also due to the issue mentioned earlier... inotify_add_watch()
> > > returns without an error, so the developers haven't realized that
> > > inotify isn't actually supported on sysfs. There are over five
> > > discussions regarding this issue that you can find them here:
> > > https://github.com/Alexays/Waybar/pull/3474
> > >
> >
> > You need to distinguish "inotify does not work"
> > from "inotify does not notify on 'remote' changes"
> > that is changes that happen on the network fs server or inside the
> > kernel (in sysfs case) vs. changes that happen via vfs syscalls
> > on the mounted fs, be it p9, cifs, sysfs.
> >
> > There are several discussions about supporting "remote change"
> > notifications for network filesystems - this is a more complex problem.
> >
> > In any case, I believe performing operations on the mounted fs
> > generated inotify events for all the fs that you listed and without
> > a claim that nobody is using this facility we cannot regress this
> > behavior without an opt-in from the application.
>
> Understood. So this is what I should work on (correct me if anything
> seems off):
> 1. Carefully list all filesystems where "remote" changes occur.
> 2. Introduce a flag like FS_DISALLOW_INOTIFY_REMOTE in fs_flags
> for these filesystems.
> 3. Provide an option for userspace, such as IN_DISALLOW_REMOTE,
> so applications can explicitly handle this behavior.
> 4. (Possibly later, if it makes sense) Introduce a new syscall where
> FS_DISALLOW_INOTIFY_REMOTE is the default behavior.
>

Generally, this is a correct way to add new functionality,
but I would rather not extend the inotify API.
fanotify API is (mostly) a super set of inotify API, so any extension
of the API better be of fanotify.

If you try to use fanotify API with FAN_REPORT_FID and
FAN_MARK_FILESYSTEM, for example, by running the tool
fsnotifywait --filesystem, I think that you will find out that many of the fs
that you wanted to blacklist already return -EOPNOTSUPP, because they
do not support nfs export and some return -ENODEV (e.g. fuse)
because they do not have an fsid.
So essentially, you (almost) already have the new API that you wanted.

As a matter of fact, before commit a95aef69a740 ("fanotify: support
reporting non-decodeable file handles") in kernel v6.6, FAN_MARK_INODE
would also yield the same result (i.e. fsnotifywait without
--filesystem) and that
is more or less equivalent to inotifywait, because unlike fsnotifywait
--filesystem,
it does not require CAP_SYS_ADMIN.

Please see if fsnotifywait --filesystem is failing to watch the filesystems
that you are interested in blacklisting and list the filesystems that do not
fail and you think should fail to watch.

If there are filesystems that do not fail (e.g. cifs) and you still want to
blacklist them, please argue your reasons.

If the behavior you get from fsnotifywait --filesystem is matching your
expectations and the only problem is that fsnotifywait without --filesystem
on recent kernel does not match your expectations, it would be easy
to add a fanotify_init flag like FAN_REPORT_DECODEABLE_FID,
which enforces the same strict requirements as --filesystem,
but does not require CAP_SYS_ADMIN.

But generally, the relation between the fs that you define as
"unreliable" to the fs that work with --filesystem is circumstantial.
A better way to identify fs that are remote fs is to check if they
implement the d_revalidate() operation.

It's easy to add an fanotify_init flag to disallow those remote fs,
but we really need to first better define what is the desired goal.
Exercise - try to write the man page of the proposed new flag
in a way that is clear to anyone who reads the description.
If you manage to do that, you are far more likely to argue your case.

Thanks,
Amir.