Re: [PATCH RFC 0/6] pidfs: implement file handle support

From: Amir Goldstein
Date: Sun Dec 01 2024 - 07:09:40 EST


On Sun, Dec 1, 2024 at 9:43 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> On Sat, Nov 30, 2024 at 01:22:05PM +0100, Amir Goldstein wrote:
> > On Fri, Nov 29, 2024 at 2:39 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > >
> > > Hey,
> > >
> > > Now that we have the preliminaries to lookup struct pid based on its
> > > inode number alone we can implement file handle support.
> > >
> > > This is based on custom export operation methods which allows pidfs to
> > > implement permission checking and opening of pidfs file handles cleanly
> > > without hacking around in the core file handle code too much.
> > >
> > > This is lightly tested.
> >
> > With my comments addressed as you pushed to vfs-6.14.pidfs branch
> > in your tree, you may add to the patches posted:
> >
> > Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx>
> >
> > HOWEVER,
> > IMO there is still one thing that has to be addressed before merge -
> > We must make sure that nfsd cannot export pidfs.
> >
> > In principal, SB_NOUSER filesystems should not be accessible to
> > userspace paths, so exportfs should not be able to configure nfsd
> > export of pidfs, but maybe this limitation can be worked around by
> > using magic link paths?
>
> I don't see how. I might be missing details.

AFAIK, nfsd gets the paths to export from userspace via
svc_export_parse() => kern_path(buf, 0, &exp.ex_path)
afterwards check_export() validates exp.ex_path and I see that regular
files can be exported.
I suppose that a pidfs file can have a magic link path no?
The question is whether this magic link path could be passed to nfsd
via the exportfs UAPI.

>
> > I think it may be worth explicitly disallowing nfsd export of SB_NOUSER
> > filesystems and we could also consider blocking SB_KERNMOUNT,
> > but may there are users exporting ramfs?
>
> No need to restrict it if it's safe, I guess.
>
> > Jeff has mentioned that he thinks we are blocking export of cgroupfs
> > by nfsd, but I really don't see where that is being enforced.
> > The requirement for FS_REQUIRES_DEV in check_export() is weak
> > because user can overrule it with manual fsid argument to exportfs.
> > So maybe we disallow nfsd export of kernfs and backport to stable kernels
> > to be on the safe side?
>
> File handles and nfs export have become two distinct things and there
> filesystems based on kernfs, and pidfs want to support file handles
> without support nfs export.
>
> So I think instead of having nfs check what filesystems may be exported
> we should let the filesystems indicate that they cannot be exported and
> make nfs honour that.

Yes, I agree, but...

>
> So something like the untested sketch:
>
> diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
> index 1358c21837f1..a5c75cb1c812 100644
> --- a/fs/kernfs/mount.c
> +++ b/fs/kernfs/mount.c
> @@ -154,6 +154,7 @@ static const struct export_operations kernfs_export_ops = {
> .fh_to_dentry = kernfs_fh_to_dentry,
> .fh_to_parent = kernfs_fh_to_parent,
> .get_parent = kernfs_get_parent_dentry,
> + .flags = EXPORT_OP_FILE_HANDLE,
> };
>
> /**
> diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> index eacafe46e3b6..170c5729e7f2 100644
> --- a/fs/nfsd/export.c
> +++ b/fs/nfsd/export.c
> @@ -417,6 +417,7 @@ static struct svc_export *svc_export_lookup(struct svc_export *);
> static int check_export(struct path *path, int *flags, unsigned char *uuid)
> {
> struct inode *inode = d_inode(path->dentry);
> + const struct export_operations *nop;
>
> /*
> * We currently export only dirs, regular files, and (for v4
> @@ -449,11 +450,16 @@ static int check_export(struct path *path, int *flags, unsigned char *uuid)
> return -EINVAL;
> }
>
> - if (!exportfs_can_decode_fh(inode->i_sb->s_export_op)) {
> + if (!exportfs_can_decode_fh(nop)) {
> dprintk("exp_export: export of invalid fs type.\n");
> return -EINVAL;
> }
>
> + if (nop && nop->flags & EXPORT_OP_FILE_HANDLE) {
> + dprintk("exp_export: filesystem only supports non-exportable file handles.\n");
> + return -EINVAL;
> + }
> +
> if (is_idmapped_mnt(path->mnt)) {
> dprintk("exp_export: export of idmapped mounts not yet supported.\n");
> return -EINVAL;
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 9aa7493b1e10..d1646c0789e1 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -83,10 +83,15 @@ void ovl_revert_creds(const struct cred *old_cred)
> */
> int ovl_can_decode_fh(struct super_block *sb)
> {
> + const struct export_operations *nop = sb->s_export_op;
> +
> if (!capable(CAP_DAC_READ_SEARCH))
> return 0;
>
> - if (!exportfs_can_decode_fh(sb->s_export_op))
> + if (!exportfs_can_decode_fh(nop))
> + return 0;
> +
> + if (nop && nop->flags & EXPORT_OP_FILE_HANDLE)
> return 0;
>
> return sb->s_export_op->encode_fh ? -1 : FILEID_INO32_GEN;
> diff --git a/fs/pidfs.c b/fs/pidfs.c
> index dde3e4e90ea9..9d98b5461dc7 100644
> --- a/fs/pidfs.c
> +++ b/fs/pidfs.c
> @@ -570,6 +570,7 @@ static const struct export_operations pidfs_export_operations = {
> .fh_to_dentry = pidfs_fh_to_dentry,
> .open = pidfs_export_open,
> .permission = pidfs_export_permission,
> + .flags = EXPORT_OP_FILE_HANDLE,
> };
>
> static int pidfs_init_inode(struct inode *inode, void *data)
> diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> index a087606ace19..98f7cb17abee 100644
> --- a/include/linux/exportfs.h
> +++ b/include/linux/exportfs.h
> @@ -280,6 +280,7 @@ struct export_operations {
> */
> #define EXPORT_OP_FLUSH_ON_CLOSE (0x20) /* fs flushes file data on close */
> #define EXPORT_OP_ASYNC_LOCK (0x40) /* fs can do async lock request */
> +#define EXPORT_OP_FILE_HANDLE (0x80) /* fs only supports file handles, no proper export */

This is a bad name IMO, since pidfs clearly does support file handles
and supports the open_by_handle_at() UAPI.

I was going to suggest EXPORT_OP_NO_NFS_EXPORT, but it also
sounds silly, so maybe:

#define EXPORT_OP_LOCAL_FILE_HANDLE (0x80) /* fs only
supports local file handles, no nfs export */

With that you may add:

Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx>

Thanks,
Amir.