Re: [RESEND RFC PATCH v2 11/14] vfs: Wire up security hooks for lsm-based device guard in userns

From: Christian Brauner
Date: Fri Nov 24 2023 - 12:06:48 EST


On Wed, Oct 25, 2023 at 11:42:21AM +0200, Michael Weiß wrote:
> Wire up security_inode_mknod_capns() in fs/namei.c. If implemented
> and access is granted by an lsm, check ns_capable() instead of the
> global CAP_MKNOD.
>
> Wire up security_sb_alloc_userns() in fs/super.c. If implemented
> and access is granted by an lsm, the created super block will allow
> access to device nodes also if it was created in a non-inital userns.
>
> Signed-off-by: Michael Weiß <michael.weiss@xxxxxxxxxxxxxxxxxxx>
> ---
> fs/namei.c | 16 +++++++++++++++-
> fs/super.c | 6 +++++-
> 2 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index f601fcbdc4d2..1f68d160e2c0 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3949,6 +3949,20 @@ inline struct dentry *user_path_create(int dfd, const char __user *pathname,
> }
> EXPORT_SYMBOL(user_path_create);
>
> +static bool mknod_capable(struct inode *dir, struct dentry *dentry,
> + umode_t mode, dev_t dev)
> +{
> + /*
> + * In case of a security hook implementation check mknod in user
> + * namespace. Otherwise just check global capability.
> + */
> + int error = security_inode_mknod_nscap(dir, dentry, mode, dev);
> + if (!error)
> + return ns_capable(current_user_ns(), CAP_MKNOD);
> + else
> + return capable(CAP_MKNOD);
> +}
> +
> /**
> * vfs_mknod - create device node or file
> * @idmap: idmap of the mount the inode was found from
> @@ -3975,7 +3989,7 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
> return error;
>
> if ((S_ISCHR(mode) || S_ISBLK(mode)) && !is_whiteout &&
> - !capable(CAP_MKNOD))
> + !mknod_capable(dir, dentry, mode, dev))
> return -EPERM;
>
> if (!dir->i_op->mknod)
> diff --git a/fs/super.c b/fs/super.c
> index 2d762ce67f6e..bb01db6d9986 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -362,7 +362,11 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
> }
> s->s_bdi = &noop_backing_dev_info;
> s->s_flags = flags;
> - if (s->s_user_ns != &init_user_ns)
> + /*
> + * We still have to think about this here. Several concerns exist
> + * about the security model, especially about malicious fuse.
> + */
> + if (s->s_user_ns != &init_user_ns && security_sb_alloc_userns(s))
> s->s_iflags |= SB_I_NODEV;

Hm, no.

We dont want to have security hooks called in alloc_super(). That's just
the wrong layer for this. This is deeply internal stuff where we should
avoid interfacing with other subsystems.

Removing SB_I_NODEV here is also problematic or at least overly broad
because you allow to circumvent this for _every_ filesystems including
stuff like proc and so on where that doesn't make any sense.