Re: [PATCH] bpf: use FS_USERNS_DELEGATABLE for bpffs

From: Christian Brauner

Date: Mon Feb 09 2026 - 06:12:21 EST


On Fri, Feb 06, 2026 at 01:42:17PM +0100, Alexander Mikhalitsyn wrote:
> On Fri, Feb 6, 2026 at 13:33, Christian Brauner
> <brauner@xxxxxxxxxx> wrote:
> >
> > On Thu, Feb 05, 2026 at 11:45:41AM +0100, Alexander Mikhalitsyn wrote:
> > > From: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@xxxxxxxxxxxxxx>
> > >
> > > Instead of FS_USERNS_MOUNT we should use the recently introduced
> > > FS_USERNS_DELEGATABLE because it better expresses what we
> > > really want here. The filesystem should not be mountable
> > > by an unprivileged user, but we still want sb->s_user_ns
> > > to point to the container's user namespace. The superblock
> > > can only be created if the capable(CAP_SYS_ADMIN) check
> > > succeeds.
> > >
> > > Tested and no regressions noticed.
> > >
> > > No functional change intended.
> > >
> > > Link: https://lore.kernel.org/linux-fsdevel/6dd181bf9f6371339a6c31f58f582a9aac3bc36a.camel@xxxxxxxxxx [1]
> > > Fixes: 6fe01d3cbb92 ("bpf: Add BPF token delegation mount options to BPF FS")
> > > Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > > Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
> > > Cc: Eduard Zingerman <eddyz87@xxxxxxxxx>
> > > Cc: Song Liu <song@xxxxxxxxxx>
> > > Cc: Yonghong Song <yonghong.song@xxxxxxxxx>
> > > Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> > > Cc: KP Singh <kpsingh@xxxxxxxxxx>
> > > Cc: Stanislav Fomichev <sdf@xxxxxxxxxxx>
> > > Cc: Hao Luo <haoluo@xxxxxxxxxx>
> > > Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> > > Cc: Jeff Layton <jlayton@xxxxxxxxxx>
> > > Cc: Christian Brauner <brauner@xxxxxxxxxx>
> > > Cc: bpf@xxxxxxxxxxxxxxx
> > > Cc: linux-fsdevel@xxxxxxxxxxxxxxx
> > > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > > Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@xxxxxxxxxxxxxx>
> > > - RWB-tag from Jeff [1]
> > > Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > ---
> > > kernel/bpf/inode.c | 6 +-----
> > > 1 file changed, 1 insertion(+), 5 deletions(-)
> > >
> > > diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
> > > index 9f866a010dad..d8dfdc846bd0 100644
> > > --- a/kernel/bpf/inode.c
> > > +++ b/kernel/bpf/inode.c
> > > @@ -1009,10 +1009,6 @@ static int bpf_fill_super(struct super_block *sb, struct fs_context *fc)
> > > struct inode *inode;
> > > int ret;
> > >
> > > - /* Mounting an instance of BPF FS requires privileges */
> > > - if (fc->user_ns != &init_user_ns && !capable(CAP_SYS_ADMIN))
> > > - return -EPERM;
> >
> > Jeff's patch does:
> >
> > if (user_ns != &init_user_ns &&
> > !(fc->fs_type->fs_flags & (FS_USERNS_MOUNT | FS_USERNS_DELEGATABLE))) {
> > errorfc(fc, "VFS: Mounting from non-initial user namespace is not allowed");
> > return ERR_PTR(-EPERM);
> > }
>
> Hi Christian,
>
> >
> > IOW, it only restricts the ability to end up in bpf_fill_super() if
> > FS_USERNS_DELEGATABLE is set. You still need to perform the permission
> > check in bpf_fill_super() though otherwise anyone can mount bpffs in an
> > unprivileged container now.
>
> Yeah, this is what mount_capable(struct fs_context *fc) does. I'm removing
> FS_USERNS_MOUNT, so now it checks capable(CAP_SYS_ADMIN) instead of
> ns_capable(fc->user_ns, CAP_SYS_ADMIN).
>
> No functional changes here.

Ah right, I remember. We require global CAP_SYS_ADMIN if FS_USERNS_MOUNT
isn't set. That's great. Thanks!

I can route Jeff's patch as a fix since the original change technically
regressed NFS a while ago.
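
For readers following along, the permission rule discussed above can be
sketched as a small userspace model. This is only an illustration, not the
kernel code: the struct, the MODEL_* flag values, and the function names are
made up for the sketch, the three booleans stand in for the results of
fc->user_ns == &init_user_ns, capable(CAP_SYS_ADMIN) and
ns_capable(fc->user_ns, CAP_SYS_ADMIN), and the order of the two checks is an
assumption (both fail with -EPERM, so the outcome does not depend on it).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Placeholder flag bits for this model only; not the kernel's values. */
#define MODEL_FS_USERNS_MOUNT       (1u << 0)
#define MODEL_FS_USERNS_DELEGATABLE (1u << 1)

/* Hypothetical stand-in for the facts the kernel would query. */
struct model_caller {
	bool in_init_user_ns;      /* fc->user_ns == &init_user_ns */
	bool global_cap_sys_admin; /* capable(CAP_SYS_ADMIN) */
	bool ns_cap_sys_admin;     /* ns_capable(fc->user_ns, CAP_SYS_ADMIN) */
};

/*
 * Models the rule stated in the thread: without FS_USERNS_MOUNT the
 * caller needs global CAP_SYS_ADMIN; with it, CAP_SYS_ADMIN in the
 * mount's user namespace is enough.
 */
static bool model_mount_capable(unsigned int fs_flags,
				const struct model_caller *c)
{
	if (!(fs_flags & MODEL_FS_USERNS_MOUNT))
		return c->global_cap_sys_admin;
	return c->ns_cap_sys_admin;
}

/* Returns 0 if the mount would be allowed in this model, -EPERM otherwise. */
int model_try_mount(unsigned int fs_flags, const struct model_caller *c)
{
	/* Models the check quoted from Jeff's patch. */
	if (!c->in_init_user_ns &&
	    !(fs_flags & (MODEL_FS_USERNS_MOUNT | MODEL_FS_USERNS_DELEGATABLE)))
		return -EPERM;

	if (!model_mount_capable(fs_flags, c))
		return -EPERM;

	return 0;
}

/* Walks through the cases from the discussion. */
void model_selfcheck(void)
{
	/* Container root: capable only inside its own user namespace. */
	struct model_caller container_root = { false, false, true };
	/* Privileged task creating a superblock owned by the container's userns. */
	struct model_caller delegating_admin = { false, true, true };

	/* DELEGATABLE alone: container root still cannot mount. */
	assert(model_try_mount(MODEL_FS_USERNS_DELEGATABLE, &container_root) == -EPERM);
	/* DELEGATABLE alone: a globally privileged task can. */
	assert(model_try_mount(MODEL_FS_USERNS_DELEGATABLE, &delegating_admin) == 0);
	/* Neither flag: a non-initial user namespace is rejected outright. */
	assert(model_try_mount(0, &delegating_admin) == -EPERM);
	/* FS_USERNS_MOUNT: namespace-local CAP_SYS_ADMIN suffices. */
	assert(model_try_mount(MODEL_FS_USERNS_MOUNT, &container_root) == 0);
}
```

In this model, replacing FS_USERNS_MOUNT with FS_USERNS_DELEGATABLE keeps
container root locked out while still letting a privileged task set up a
superblock for a non-initial user namespace, which is why the explicit check
in bpf_fill_super() becomes redundant.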