Re: [PATCH] bpf: use FS_USERNS_DELEGATABLE for bpffs

From: Christian Brauner

Date: Fri Feb 06 2026 - 07:33:30 EST


On Thu, Feb 05, 2026 at 11:45:41AM +0100, Alexander Mikhalitsyn wrote:
> From: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@xxxxxxxxxxxxxx>
>
> Instead of FS_USERNS_MOUNT we should use recently introduced
> FS_USERNS_DELEGATABLE cause it better expresses what we
> really want to get there. Filesystem should not be allowed
> to be mounted by an unprivileged user, but at the same time
> we want to have sb->s_user_ns to point to the container's
> user namespace, at the same time superblock can only
> be created if capable(CAP_SYS_ADMIN) check is successful.
>
> Tested and no regressions noticed.
>
> No functional change intended.
>
> Link: https://lore.kernel.org/linux-fsdevel/6dd181bf9f6371339a6c31f58f582a9aac3bc36a.camel@xxxxxxxxxx [1]
> Fixes: 6fe01d3cbb92 ("bpf: Add BPF token delegation mount options to BPF FS")
> Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
> Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
> Cc: Eduard Zingerman <eddyz87@xxxxxxxxx>
> Cc: Song Liu <song@xxxxxxxxxx>
> Cc: Yonghong Song <yonghong.song@xxxxxxxxx>
> Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> Cc: KP Singh <kpsingh@xxxxxxxxxx>
> Cc: Stanislav Fomichev <sdf@xxxxxxxxxxx>
> Cc: Hao Luo <haoluo@xxxxxxxxxx>
> Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> Cc: Jeff Layton <jlayton@xxxxxxxxxx>
> Cc: Christian Brauner <brauner@xxxxxxxxxx>
> Cc: bpf@xxxxxxxxxxxxxxx
> Cc: linux-fsdevel@xxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@xxxxxxxxxxxxxx>
> - RWB-tag from Jeff [1]
> Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
> kernel/bpf/inode.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
> index 9f866a010dad..d8dfdc846bd0 100644
> --- a/kernel/bpf/inode.c
> +++ b/kernel/bpf/inode.c
> @@ -1009,10 +1009,6 @@ static int bpf_fill_super(struct super_block *sb, struct fs_context *fc)
>  	struct inode *inode;
>  	int ret;
>
> -	/* Mounting an instance of BPF FS requires privileges */
> -	if (fc->user_ns != &init_user_ns && !capable(CAP_SYS_ADMIN))
> -		return -EPERM;

Jeff's patch does:

	if (user_ns != &init_user_ns &&
	    !(fc->fs_type->fs_flags & (FS_USERNS_MOUNT | FS_USERNS_DELEGATABLE))) {
		errorfc(fc, "VFS: Mounting from non-initial user namespace is not allowed");
		return ERR_PTR(-EPERM);
	}

IOW, it only gates the ability to end up in bpf_fill_super() on
FS_USERNS_DELEGATABLE being set; the check above doesn't look at
capabilities at all. You still need to perform the permission check in
bpf_fill_super(), though, otherwise anyone can mount bpffs in an
unprivileged container now.

So either Jeff's patch needs to be changed to require
capable(CAP_SYS_ADMIN) when FS_USERNS_DELEGATABLE is set (which makes
sense to me in general) or the check needs to remain in bpf_fill_super().
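
To make the gap concrete, here is a rough userspace model of the two
options. It is only a sketch, not kernel code: struct mounter, the flag
values and all three helper functions are hypothetical, and capable() is
reduced to a plain boolean.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flags; the bit values are arbitrary. */
#define FS_USERNS_MOUNT       (1u << 0)
#define FS_USERNS_DELEGATABLE (1u << 1)

/* Hypothetical description of the task doing the mount. */
struct mounter {
	bool in_init_userns;    /* models fc->user_ns == &init_user_ns */
	bool has_cap_sys_admin; /* models capable(CAP_SYS_ADMIN) */
};

/* Models the quoted VFS check: it looks at fs_flags only, never at capabilities. */
static bool vfs_gate_allows(const struct mounter *m, unsigned int fs_flags)
{
	if (!m->in_init_userns &&
	    !(fs_flags & (FS_USERNS_MOUNT | FS_USERNS_DELEGATABLE)))
		return false;
	return true;
}

/* Models the check that this patch removes from bpf_fill_super(). */
static bool fill_super_check_allows(const struct mounter *m)
{
	if (!m->in_init_userns && !m->has_cap_sys_admin)
		return false;
	return true;
}

/*
 * Models the first option: the VFS check itself requires CAP_SYS_ADMIN
 * whenever FS_USERNS_DELEGATABLE is set (an assumed shape, not Jeff's code).
 */
static bool vfs_gate_strict_allows(const struct mounter *m, unsigned int fs_flags)
{
	if (!vfs_gate_allows(m, fs_flags))
		return false;
	if (!m->in_init_userns && (fs_flags & FS_USERNS_DELEGATABLE) &&
	    !m->has_cap_sys_admin)
		return false;
	return true;
}
```

In this model, an unprivileged mounter outside the initial user namespace
passes vfs_gate_allows() with FS_USERNS_DELEGATABLE set, and is stopped
only by fill_super_check_allows() or by the stricter gate.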

@Jeff do you require capable(CAP_SYS_ADMIN) from within nfs? I think you
somehow must because otherwise what prevents a container from mounting
arbitrary servers?

> -
>  	ret = simple_fill_super(sb, BPF_FS_MAGIC, bpf_rfiles);
>  	if (ret)
>  		return ret;
> @@ -1085,7 +1081,7 @@ static struct file_system_type bpf_fs_type = {
>  	.init_fs_context = bpf_init_fs_context,
>  	.parameters = bpf_fs_parameters,
>  	.kill_sb = bpf_kill_super,
> -	.fs_flags = FS_USERNS_MOUNT,
> +	.fs_flags = FS_USERNS_DELEGATABLE,
>  };
>
> static int __init bpf_init(void)
> --
> 2.47.3
>