Re: [PATCH RFC 0/6] pidfs: implement file handle support

From: Christian Brauner
Date: Sun Dec 01 2024 - 07:44:41 EST


On Sun, Dec 01, 2024 at 01:09:17PM +0100, Amir Goldstein wrote:
> On Sun, Dec 1, 2024 at 9:43 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> >
> > On Sat, Nov 30, 2024 at 01:22:05PM +0100, Amir Goldstein wrote:
> > > On Fri, Nov 29, 2024 at 2:39 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > > >
> > > > Hey,
> > > >
> > > > Now that we have the preliminaries to lookup struct pid based on its
> > > > inode number alone we can implement file handle support.
> > > >
> > > > This is based on custom export operation methods which allows pidfs to
> > > > implement permission checking and opening of pidfs file handles cleanly
> > > > without hacking around in the core file handle code too much.
> > > >
> > > > This is lightly tested.
> > >
> > > With my comments addressed as you pushed to vfs-6.14.pidfs branch
> > > in your tree, you may add to the patches posted:
> > >
> > > Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx>
> > >
> > > HOWEVER,
> > > IMO there is still one thing that has to be addressed before merge -
> > > We must make sure that nfsd cannot export pidfs.
> > >
> > > In principal, SB_NOUSER filesystems should not be accessible to
> > > userspace paths, so exportfs should not be able to configure nfsd
> > > export of pidfs, but maybe this limitation can be worked around by
> > > using magic link paths?
> >
> > I don't see how. I might be missing details.
>
> AFAIK, nfsd gets the paths to export from userspace via
> svc_export_parse() => kern_path(buf, 0, &exp.ex_path)
> afterwards check_export() validates exp.ex_path and I see that regular
> files can be exported.
> I suppose that a pidfs file can have a magic link path no?
> The question is whether this magic link path could be passed to nfsd
> via the exportfs UAPI.

Ah, ok. I see what you mean. You're thinking about specifying
/proc/<pid>/fd/<pidfd> in /etc/exports. Yes, that would work.

>
> >
> > > I think it may be worth explicitly disallowing nfsd export of SB_NOUSER
> > > filesystems and we could also consider blocking SB_KERNMOUNT,
> > > but may there are users exporting ramfs?
> >
> > No need to restrict it if it's safe, I guess.
> >
> > > Jeff has mentioned that he thinks we are blocking export of cgroupfs
> > > by nfsd, but I really don't see where that is being enforced.
> > > The requirement for FS_REQUIRES_DEV in check_export() is weak
> > > because user can overrule it with manual fsid argument to exportfs.
> > > So maybe we disallow nfsd export of kernfs and backport to stable kernels
> > > to be on the safe side?
> >
> > File handles and nfs export have become two distinct things and there
> > filesystems based on kernfs, and pidfs want to support file handles
> > without support nfs export.
> >
> > So I think instead of having nfs check what filesystems may be exported
> > we should let the filesystems indicate that they cannot be exported and
> > make nfs honour that.
>
> Yes, I agree, but...
>
> >
> > So something like the untested sketch:
> >
> > diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
> > index 1358c21837f1..a5c75cb1c812 100644
> > --- a/fs/kernfs/mount.c
> > +++ b/fs/kernfs/mount.c
> > @@ -154,6 +154,7 @@ static const struct export_operations kernfs_export_ops = {
> > .fh_to_dentry = kernfs_fh_to_dentry,
> > .fh_to_parent = kernfs_fh_to_parent,
> > .get_parent = kernfs_get_parent_dentry,
> > + .flags = EXPORT_OP_FILE_HANDLE,
> > };
> >
> > /**
> > diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> > index eacafe46e3b6..170c5729e7f2 100644
> > --- a/fs/nfsd/export.c
> > +++ b/fs/nfsd/export.c
> > @@ -417,6 +417,7 @@ static struct svc_export *svc_export_lookup(struct svc_export *);
> > static int check_export(struct path *path, int *flags, unsigned char *uuid)
> > {
> > struct inode *inode = d_inode(path->dentry);
> > + const struct export_operations *nop;
> >
> > /*
> > * We currently export only dirs, regular files, and (for v4
> > @@ -449,11 +450,16 @@ static int check_export(struct path *path, int *flags, unsigned char *uuid)
> > return -EINVAL;
> > }
> >
> > - if (!exportfs_can_decode_fh(inode->i_sb->s_export_op)) {
> > + if (!exportfs_can_decode_fh(nop)) {
> > dprintk("exp_export: export of invalid fs type.\n");
> > return -EINVAL;
> > }
> >
> > + if (nop && nop->flags & EXPORT_OP_FILE_HANDLE) {
> > + dprintk("exp_export: filesystem only supports non-exportable file handles.\n");
> > + return -EINVAL;
> > + }
> > +
> > if (is_idmapped_mnt(path->mnt)) {
> > dprintk("exp_export: export of idmapped mounts not yet supported.\n");
> > return -EINVAL;
> > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> > index 9aa7493b1e10..d1646c0789e1 100644
> > --- a/fs/overlayfs/util.c
> > +++ b/fs/overlayfs/util.c
> > @@ -83,10 +83,15 @@ void ovl_revert_creds(const struct cred *old_cred)
> > */
> > int ovl_can_decode_fh(struct super_block *sb)
> > {
> > + const struct export_operations *nop = sb->s_export_op;
> > +
> > if (!capable(CAP_DAC_READ_SEARCH))
> > return 0;
> >
> > - if (!exportfs_can_decode_fh(sb->s_export_op))
> > + if (!exportfs_can_decode_fh(nop))
> > + return 0;
> > +
> > + if (nop && nop->flags & EXPORT_OP_FILE_HANDLE)
> > return 0;
> >
> > return sb->s_export_op->encode_fh ? -1 : FILEID_INO32_GEN;
> > diff --git a/fs/pidfs.c b/fs/pidfs.c
> > index dde3e4e90ea9..9d98b5461dc7 100644
> > --- a/fs/pidfs.c
> > +++ b/fs/pidfs.c
> > @@ -570,6 +570,7 @@ static const struct export_operations pidfs_export_operations = {
> > .fh_to_dentry = pidfs_fh_to_dentry,
> > .open = pidfs_export_open,
> > .permission = pidfs_export_permission,
> > + .flags = EXPORT_OP_FILE_HANDLE,
> > };
> >
> > static int pidfs_init_inode(struct inode *inode, void *data)
> > diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> > index a087606ace19..98f7cb17abee 100644
> > --- a/include/linux/exportfs.h
> > +++ b/include/linux/exportfs.h
> > @@ -280,6 +280,7 @@ struct export_operations {
> > */
> > #define EXPORT_OP_FLUSH_ON_CLOSE (0x20) /* fs flushes file data on close */
> > #define EXPORT_OP_ASYNC_LOCK (0x40) /* fs can do async lock request */
> > +#define EXPORT_OP_FILE_HANDLE (0x80) /* fs only supports file handles, no proper export */
>
> This is a bad name IMO, since pidfs clearly does support file handles
> and supports the open_by_handle_at() UAPI.
>
> I was going to suggest EXPORT_OP_NO_NFS_EXPORT, but it also
> sounds silly, so maybe:
>
> #define EXPORT_OP_LOCAL_FILE_HANDLE (0x80) /* fs only
> supports local file handles, no nfs export */

Thank you. I'll send a reply with a proper patch to this thread in a second.

> With that you may add:
>
> Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx>
>
> Thanks,
> Amir.