Re: [PATCH 0/2] proc: use subset option to hide some top-level procfs entries

From: Alexey Gladkov
Date: Fri Jun 05 2020 - 10:47:25 EST


On Thu, Jun 04, 2020 at 11:17:38PM -0500, Eric W. Biederman wrote:
> >> I am not going to seriously look at this for merging until after the
> >> merge window closes.
> >
> > OK. I'll wait.
>
> That will mean your patches can be based on -rc1.

OK.

> > Do you suggest to allow a user to mount procfs with hidepid=2,subset=pid
> > options? If so then this is an interesting idea.
>
> The key part would be subset=pid. You would still need to be root in
> your user namespace, and mount namespace. You would not need to have a
> separate copy of proc with nothing hidden already mounted.

Can you tell me more about your idea? I thought I understood it, but it
seems my understanding is different.

I thought you were suggesting that we move in the direction of allowing
an unprivileged user to mount procfs.
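
For concreteness, the mount being discussed (hidepid=2,subset=pid) could
be requested roughly as in the sketch below. This is only an
illustration, not code from the patchset: the helper names
build_proc_opts() and mount_restricted_proc() are hypothetical, and the
mount(2) call still needs the caller to be root in its user and mount
namespaces, as described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: format the procfs option string into buf.
 * Returns 0 on success, -1 if hidepid is negative or buf is too small.
 */
int build_proc_opts(char *buf, size_t len, int hidepid, int subset_pid)
{
    int n;

    assert(buf != NULL && len > 0);
    if (hidepid < 0)
        return -1;

    n = snprintf(buf, len, "hidepid=%d%s", hidepid,
                 subset_pid ? ",subset=pid" : "");
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: mount a procfs instance at target that exposes
 * only the pid subset and hides other users' processes.
 * Returns the result of mount(2): 0 on success, -1 with errno set.
 */
int mount_restricted_proc(const char *target)
{
    char opts[64];

    if (build_proc_opts(opts, sizeof(opts), 2, 1) != 0)
        return -1;

    return mount("proc", target, "proc",
                 MS_NOSUID | MS_NODEV | MS_NOEXEC, opts);
}
```

Calling mount_restricted_proc("/proc") inside a new mount namespace
corresponds to "mount -t proc -o hidepid=2,subset=pid proc /proc".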

> > I can not agree with this because I do not touch on other options.
> > The hidepid and subset=pid options have no relation to the visibility of regular
> > files. On the other hand, in procfs there is absolutely no way to restrict
> > access other than selinux.
>
> Untrue. At a practical level the user namespace greatly restricts
> access to proc because many of the non-process files are limited to
> global root only.

I am not worried about the files created in procfs by the kernel itself
because the permissions are set correctly and are checked correctly.

I worry about kernel modules, especially about modules out of tree.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/function/rndis.c#n904

I certainly understand that 0660 is not 0666, but still.

> > I know that java uses meminfo for sure.
> >
> > The purpose of this patch is to isolate the container from unwanted files
> > in procfs.
>
> If what we want is the ability not to use the original but to have
> a modified version of these files, we probably want empty files that
> serve as mount points.
>
> Or possibly a version of these files that takes into account
> restrictions. In either event we need to do the research through real
> programs and real kernel options to see what is our best option for
> exporting the limitations that programs have and deciding on the long
> term API for that.

Yes, but that's a slightly different story. It would be great if all of
these files provide modified information.

My patch is about those files that we don't know about and which we don't
want.

> If we research things and we decide the best way to let java know of
> its limitations is to change /proc/meminfo. That needs to be a change
> that always applies to meminfo and is not controlled by options.
>
> > For now I'm just trying to create a better way to restrict access in
> > the procfs than this since procfs is used in containers.
>
> Docker historically has been crap about having a sensible policy. The
> problem is that Docker wanted to allow real root in a container and
> somehow make it safe by blocking access to proc files and by dropping
> capabilities.
>
> Practically everything that Docker has done is much better and simpler by
> restricting the processes to a user namespace, with a root user whose
> uid is not the global root user.
>
> Which is why I want us to make certain we are doing something that makes
> sense, and is architecturally sound.

Ok. Then ignore this patchset.

> You have cleared the big hurdle and proc now has options that are
> usable. I really appreciate that. I am not opposed to the general
> direction you are going to find a way to make proc more usable. I just
> want our next step to be solid.

--
Rgrds, legion