Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls

From: Amir Goldstein
Date: Thu Mar 27 2025 - 07:42:29 EST


On Thu, Mar 27, 2025 at 10:33 AM Andrey Albershteyn <aalbersh@xxxxxxxxxx> wrote:
>
> On 2025-03-23 09:56:25, Amir Goldstein wrote:
> > On Fri, Mar 21, 2025 at 8:49 PM Andrey Albershteyn <aalbersh@xxxxxxxxxx> wrote:
> > >
> > > From: Andrey Albershteyn <aalbersh@xxxxxxxxxx>
> > >
> > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > > extended attributes/flags. The syscalls take parent directory fd and
> > > path to the child together with struct fsxattr.
> > >
> > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > > that file don't need to be open as we can reference it with a path
> > > instead of fd. By having this we can manipulated inode extended
> > > attributes not only on regular files but also on special ones. This
> > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > > we can not call ioctl() directly on the filesystem inode using fd.
> > >
> > > This patch adds two new syscalls which allows userspace to get/set
> > > extended inode attributes on special files by using parent directory
> > > and a path - *at() like syscall.
> > >
> > > CC: linux-api@xxxxxxxxxxxxxxx
> > > CC: linux-fsdevel@xxxxxxxxxxxxxxx
> > > CC: linux-xfs@xxxxxxxxxxxxxxx
> > > Signed-off-by: Andrey Albershteyn <aalbersh@xxxxxxxxxx>
> > > Acked-by: Arnd Bergmann <arnd@xxxxxxxx>
> > > ---
> > ...
> > > +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> > > + struct fsxattr __user *, ufsx, size_t, usize,
> > > + unsigned int, at_flags)
> > > +{
> > > + struct fileattr fa;
> > > + struct path filepath;
> > > + int error;
> > > + unsigned int lookup_flags = 0;
> > > + struct filename *name;
> > > + struct mnt_idmap *idmap;.
> >
> > > + struct dentry *dentry;
> > > + struct vfsmount *mnt;
> > > + struct fsxattr fsx = {};
> > > +
> > > + BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > > + BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > > +
> > > + if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > > + return -EINVAL;
> > > +
> > > + if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > > + lookup_flags |= LOOKUP_FOLLOW;
> > > +
> > > + if (at_flags & AT_EMPTY_PATH)
> > > + lookup_flags |= LOOKUP_EMPTY;
> > > +
> > > + if (usize > PAGE_SIZE)
> > > + return -E2BIG;
> > > +
> > > + if (usize < FSXATTR_SIZE_VER0)
> > > + return -EINVAL;
> > > +
> > > + error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> > > + if (error)
> > > + return error;
> > > +
> > > + fsxattr_to_fileattr(&fsx, &fa);
> > > +
> > > + name = getname_maybe_null(filename, at_flags);
> > > + if (!name) {
> > > + CLASS(fd, f)(dfd);
> > > +
> > > + if (fd_empty(f))
> > > + return -EBADF;
> > > +
> > > + idmap = file_mnt_idmap(fd_file(f));
> > > + dentry = file_dentry(fd_file(f));
> > > + mnt = fd_file(f)->f_path.mnt;
> > > + } else {
> > > + error = filename_lookup(dfd, name, lookup_flags, &filepath,
> > > + NULL);
> > > + if (error)
> > > + return error;
> > > +
> > > + idmap = mnt_idmap(filepath.mnt);
> > > + dentry = filepath.dentry;
> > > + mnt = filepath.mnt;
> > > + }
> > > +
> > > + error = mnt_want_write(mnt);
> > > + if (!error) {
> > > + error = vfs_fileattr_set(idmap, dentry, &fa);
> > > + if (error == -ENOIOCTLCMD)
> > > + error = -EOPNOTSUPP;
> >
> > This is awkward.
> > vfs_fileattr_set() should return -EOPNOTSUPP.
> > ioctl_setflags() could maybe convert it to -ENOIOCTLCMD,
> > but looking at similar cases ioctl_fiemap(), ioctl_fsfreeze() the
> > ioctl returns -EOPNOTSUPP.
> >
> > I don't think it is necessarily a bad idea to start returning
> > -EOPNOTSUPP instead of -ENOIOCTLCMD for the ioctl
> > because that really reflects the fact that the ioctl is now implemented
> > in vfs and not in the specific fs.
> >
> > and I think it would not be a bad idea at all to make that change
> > together with the merge of the syscalls as a sort of hint to userspace
> > that uses the ioctl, that the sycalls API exists.
> >
> > Thanks,
> > Amir.
> >
>
> Hmm, not sure what you're suggesting here. I see it as:
> - get/setfsxattrat should return EOPNOTSUPP as it make more sense
> than ENOIOCTLCMD
> - ioctl_setflags returns ENOIOCTLCMD which also expected
>
> Don't really see a reason to change what vfs_fileattr_set() returns
> and then copying this if() to other places or start returning
> EOPNOTSUPP.

ENOIOCTLCMD conceptually means that the ioctl command is unknown
This is not the case since ->fileattr_[gs]et() became a vfs API
the ioctl command is handled by vfs and it is known, but individual
filesystems may not support it, so conceptually, returning EOPNOTSUPP
from ioctl() is more correct these days, exactly as is done with the ioctls
FS_IOC_FIEMAP and FIFREEZE which were also historically per fs
ioctls and made into a vfs API.

The fact that bcachefs does not implement ->fileattr_[gs]et() and does
implement FS_IOC_FS[GS]ETXATTR is an oversight IMO, since it
was probably merged after the vfs conversion patch.

This mistake means that bcachefs fileattr cannot be copied up by
ovl_copy_fileattr() which uses the vfs API and NOT the ioctl.

However, if you would made the internal vfs API change that I suggested,
it will have broken ovl_real_fileattr_get() and ovl_copy_fileattr(),
so leave it for now - if I care enough I can do it later together with
fixing the overlayfs and fuse code.

Thanks,
Amir.