Re: [RFC PATCH 1/7] FUSE: Add the fsnotify opcode and in/out structs to FUSE

From: Vivek Goyal
Date: Wed Oct 27 2021 - 17:46:14 EST


On Tue, Oct 26, 2021 at 05:56:03PM +0300, Amir Goldstein wrote:
> On Mon, Oct 25, 2021 at 11:47 PM Ioannis Angelakopoulos
> <iangelak@xxxxxxxxxx> wrote:
> >
> > Since fsnotify is the backend for the inotify subsystem all the backend
> > code implementation we add is related to fsnotify.
> >
> > To support an fsnotify request in FUSE and specifically virtiofs we add a
> > new opcode for the FSNOTIFY (51) operation request in the "fuse.h" header.
> >
> > Also add the "fuse_notify_fsnotify_in" and "fuse_notify_fsnotify_out"
> > structs that are responsible for passing the fsnotify/inotify related data
> > to and from the FUSE server.
> >
> > Signed-off-by: Ioannis Angelakopoulos <iangelak@xxxxxxxxxx>
> > ---
> > include/uapi/linux/fuse.h | 23 ++++++++++++++++++++++-
> > 1 file changed, 22 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
> > index 46838551ea84..418b7fc72417 100644
> > --- a/include/uapi/linux/fuse.h
> > +++ b/include/uapi/linux/fuse.h
> > @@ -186,6 +186,9 @@
> > * - add FUSE_SYNCFS
> > * 7.35
> > * - add FUSE_NOTIFY_LOCK
> > + * 7.36
> > + * - add FUSE_HAVE_FSNOTIFY
> > + * - add fuse_notify_fsnotify_(in,out)
> > */
> >
> > #ifndef _LINUX_FUSE_H
> > @@ -221,7 +224,7 @@
> > #define FUSE_KERNEL_VERSION 7
> >
> > /** Minor version number of this interface */
> > -#define FUSE_KERNEL_MINOR_VERSION 35
> > +#define FUSE_KERNEL_MINOR_VERSION 36
> >
> > /** The node ID of the root inode */
> > #define FUSE_ROOT_ID 1
> > @@ -338,6 +341,7 @@ struct fuse_file_lock {
> > * write/truncate sgid is killed only if file has group
> > * execute permission. (Same as Linux VFS behavior).
> > * FUSE_SETXATTR_EXT: Server supports extended struct fuse_setxattr_in
> > + * FUSE_HAVE_FSNOTIFY: remote fsnotify/inotify event subsystem support
> > */
> > #define FUSE_ASYNC_READ (1 << 0)
> > #define FUSE_POSIX_LOCKS (1 << 1)
> > @@ -369,6 +373,7 @@ struct fuse_file_lock {
> > #define FUSE_SUBMOUNTS (1 << 27)
> > #define FUSE_HANDLE_KILLPRIV_V2 (1 << 28)
> > #define FUSE_SETXATTR_EXT (1 << 29)
> > +#define FUSE_HAVE_FSNOTIFY (1 << 30)
> >
> > /**
> > * CUSE INIT request/reply flags
> > @@ -515,6 +520,7 @@ enum fuse_opcode {
> > FUSE_SETUPMAPPING = 48,
> > FUSE_REMOVEMAPPING = 49,
> > FUSE_SYNCFS = 50,
> > + FUSE_FSNOTIFY = 51,
> >
> > /* CUSE specific operations */
> > CUSE_INIT = 4096,
> > @@ -532,6 +538,7 @@ enum fuse_notify_code {
> > FUSE_NOTIFY_RETRIEVE = 5,
> > FUSE_NOTIFY_DELETE = 6,
> > FUSE_NOTIFY_LOCK = 7,
> > + FUSE_NOTIFY_FSNOTIFY = 8,
> > FUSE_NOTIFY_CODE_MAX,
> > };
> >
> > @@ -571,6 +578,20 @@ struct fuse_getattr_in {
> > uint64_t fh;
> > };
> >
> > +struct fuse_notify_fsnotify_out {
> > + uint64_t inode;
>
> 64bit inode is not a good unique identifier of the object.

I think he wants to store 64bit nodeid (internal to fuse so that client
and server can figure out which inode they are talking about). But I
think you are concerned about what happens if an event arrived for an
inode after inode has been released and nodeid possibly used for some
other inode. And then we will find that new inode in guest cache and
end up associating event with wrong inode.

Generation number will help in the sense that server has a chance
to always update generation number on lookup. So even if nodeid
is reused, generation number will make make sure we don't end
up associating this event with reused node id inode. I guess
makes sense.

> you need to either include the generation in object identifier
> or much better use the object's nfs file handle, the same way
> that fanotify stores object identifiers.

I think nfs file handle is much more complicated and its a separate
project altogether. I am assuming we are talking about persistent
nfs file handle as generated by host. I think biggest issue we faced
with that is that guest is untrusted and we don't want to resolve
file handle provided by guest on host otherwise guest can craft
file handles and possibly be able to open other files on same filesystem
outside shared dir.

>
> > + uint64_t mask;
> > + uint32_t namelen;
> > + uint32_t cookie;
>
> I object to persisting with the two-events-joined-by-cookie design.
> Any new design should include a single event for rename
> with information about src and dst.
>
> I know this is inconvenient, but we are NOT going to create a "remote inotify"
> interface, we need to create a "remote fsnotify" interface and if server wants
> to use inotify, it will need to join the disjoined MOVE_FROM/TO event into
> a single "remote event", that FUSE will use to call fsnotify_move().

man inotify says following.

" Matching up the IN_MOVED_FROM and IN_MOVED_TO event pair generated by
rename(2) is thus inherently racy. (Don't forget that if an object is
renamed outside of a monitored directory, there may not even be an
IN_MOVED_TO event.)"

So if guest is no monitoring target dir of renamed file, then we will not
even get IN_MOVED_TO. In that case we can't merge two events into one.

And this sounds like inotify/fanotify interface needs to come up with
an merged event and in that case remote filesystem will simply propagate
that event. (Instead of coming up with a new event only for remote
filesystems. Sounds like this is not a problem limited to remote
filesystems only).

Thanks
Vivek