Re: [RFC] Add option to mount only a pids subset

From: Andy Lutomirski
Date: Sun Mar 12 2017 - 23:21:00 EST


On Sat, Mar 11, 2017 at 6:13 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> PS: AFAICS, simple mount --bind of your pid-only mount will suddenly
> expose the full thing. And as for the lifetimes making no sense...
> note that you are simply not freeing these structures of yours.
> Try to handle that and you'll get a serious PITA all over the
> place.
>
> What are you trying to achieve, anyway? Why not add a second vfsmount
> pointer per pid_namespace and make it initialized on demand, at the
> first attempt of no-pid mount? Just have a separate no-pid instance
> created for those namespaces where it had been asked for, with
> separate superblock and dentry tree not containing anything other
> than pid-only parts + self + thread-self...

Can't we just make procfs work like most other filesystems and have
each mount have its own superblock? If we need to do something funky
to stat() output to keep existing userspace working, I think that's
okay.
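
To make the stat() concern concrete, here is a minimal userspace sketch.
It assumes st_dev is a reasonable proxy for "same superblock", which holds
for most filesystems but is not guaranteed everywhere. The helper name
same_superblock is hypothetical. If every procfs mount in a pid namespace
shares one superblock, two such mounts would be expected to report the same
st_dev; with one superblock per mount they presumably would not, and that
difference is what existing userspace might notice.

```python
import os


def same_superblock(path_a, path_b):
    """Return True if both paths report the same st_dev.

    Assumption: st_dev identifies the backing superblock. This is a
    heuristic for illustration, not a kernel-guaranteed property.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # /proc and /proc/self sit on the same procfs instance, so they
    # should compare equal. Skipped on systems without procfs.
    if os.path.isdir("/proc/self"):
        print(same_superblock("/proc", "/proc/self"))
```

Pointing the helper at two separate procfs mount points (for example /proc
and a second mount made for testing) would show whether they currently
share a device number.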

As far as I can tell, proc_mnt is very nearly useless -- it seems to
be used for proc_flush_task (which claims to be purely an optimization
and could be preserved in the common case where there's only one
relevant mount) and for sysctl_binary. For the latter, we could
create proc_mnt but make actual user-initiated mounts be new
superblocks anyway.