Re: [PATCH v9 4/5] proc: Skip the visibility check if subset=pid is used
From: Christian Brauner
Date: Thu Apr 16 2026 - 09:32:24 EST

On Thu, Apr 16, 2026 at 10:46:50PM +1000, Aleksa Sarai wrote:
> On 2026-04-16, Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
> > On 2026-04-13, Alexey Gladkov <legion@xxxxxxxxxx> wrote:
> > > When procfs is mounted with the subset=pid option, all system files and
> > > directories from the root of the filesystem are not accessible in
> > > userspace. Only dynamic information about processes is available, which
> > > cannot be hidden with overmount.
> > >
> > > For this reason, checking for full visibility is not relevant if mounting
> > > is performed with the subset=pid option.
> > >
> > > Signed-off-by: Alexey Gladkov <legion@xxxxxxxxxx>
> > > ---
> >
> > > -static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags)
> > > +static bool mount_too_revealing(struct fs_context *fc, int *new_mnt_flags)
> > >  {
> > >  	const unsigned long required_iflags = SB_I_NOEXEC | SB_I_NODEV;
> > >  	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
> > > +	const struct super_block *sb = fc->root->d_sb;
> > >  	unsigned long s_iflags;
> > >
> > >  	if (ns->user_ns == &init_user_ns)
> > > @@ -6388,7 +6387,7 @@ static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags
> > >  		return true;
> > >  	}
> > >
> > > -	return !mnt_already_visible(ns, sb, new_mnt_flags);
> > > +	return (!fc->skip_visibility && !mnt_already_visible(ns, sb, new_mnt_flags));
> > >  }
> >
> > Unless I'm missing something (I haven't tested this locally yet, sorry),
> > this will allow you to bypass mount_too_revealing() even for
> > non-subset=pid mounts because once you create a subset=pid mount then a
> > regular procfs mount will see the subset=pid mount and permit it.
> >
> > I think the solution is quite simple -- you can also skip super-blocks
> > that have fc->skip_visibility set in mnt_already_visible().
>
> I now see that check was present in v8 but I guess its importance wasn't
> obvious. I guess this means we will need to reintroduce
> SB_I_USERNS_ALLOW_REVEALING. :/

I've been playing with something else. First, we should move
SB_I_USERNS_REVEALING to an fs_type flag. It isn't optional: it is always
set and never removed. That also means we can simplify sysfs_get_tree()
to just kernfs_get_tree().

Then we raise SB_I_USERNS_RESTRICTED on all procfs mounts with pid_only
and disallow using them when calculating mount permissions for
unrestricted procfs mounts.

Aleksa?
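
To make the problem and the proposed rule concrete, here is a rough,
untested userspace model. It is only a sketch: struct model_mount,
model_already_visible() and model_too_revealing() are made-up names that
loosely mirror mnt_already_visible() and mount_too_revealing(), and the
skip_restricted switch stands in for "don't count pid_only mounts". None
of this is kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one procfs mount in the namespace. */
struct model_mount {
	bool pid_only;		/* mounted with subset=pid */
	bool overmounted;	/* part of it is hidden, so not fully visible */
};

/*
 * Loose model of mnt_already_visible(): is there a fully visible procfs
 * mount that justifies creating another one? With skip_restricted set,
 * pid_only mounts are never used as justification (the proposed rule).
 */
static bool model_already_visible(const struct model_mount *mnts, size_t n,
				  bool skip_restricted)
{
	for (size_t i = 0; i < n; i++) {
		if (mnts[i].overmounted)
			continue;
		if (skip_restricted && mnts[i].pid_only)
			continue;
		return true;
	}
	return false;
}

/*
 * Loose model of mount_too_revealing() for a new procfs mount. A new
 * pid_only mount skips the visibility check, as in the v9 patch.
 */
static bool model_too_revealing(const struct model_mount *mnts, size_t n,
				bool new_pid_only, bool skip_restricted)
{
	if (new_pid_only)
		return false;
	return !model_already_visible(mnts, n, skip_restricted);
}
```

Take a namespace whose only procfs mount is overmounted. A full procfs
mount is refused there. After adding a subset=pid mount, the model with
skip_restricted == false permits a full procfs mount, which is the bypass
described above. With skip_restricted == true it is refused again.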