Re: [PATCH v4 2/2] procfs/tasks: add a simple per-task procfs hidepid= field

From: Andy Lutomirski
Date: Fri Jan 20 2017 - 19:53:34 EST


On Fri, Jan 20, 2017 at 8:33 AM, Djalal Harouni <tixxdz@xxxxxxxxx> wrote:
> On Thu, Jan 19, 2017 at 8:52 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> On Thu, Jan 19, 2017 at 5:53 AM, Djalal Harouni <tixxdz@xxxxxxxxx> wrote:
> [...]
>>> Sure, the hidepid mount option is old enough, and this per-task
>>> hidepid is clearly defined only for procfs and per task, we can't add
>>> another switch that's related to both a filesystem and pid namespaces,
>>> it will be a bit complicated and not really useful for cases that are
>>> in the *same* pidns where *each* one has to mount its procfs, it will
>>> propagate. Also as noted by Lafcadio, the gid thing is a bit hard to
>>> use now.
>>
>> What I'm trying to say is that I want to understand a complete,
>> real-world use case. Adding a security-related per-task flag can
>> be quite messy and requires a lot of careful thought to get right, and
>> I'd rather avoid it if at all possible.
>
> I do agree, but that's not what we are proposing here. This use case
> is limited: we do not manipulate the creds of the task, there are no
> security transitions. The task does not change, it's only related to
> procfs and pid entries there. Also the flag applies only to current
> task and not on remote ones... Nothing new here it's an extension of
> procfs hidepid.
>
>> I'm imagining something like a new RestrictPidVisibility= option in
>> systemd. I agree that this is currently a mess to do. But maybe a
>
> Yes, that's one use case. If we manage to land this I'll follow up with
> it... plus I have a use case related to kubernetes where I do
> want to reduce the number of processes inside containers per pod to
> a minimum. Some other cases are: lock down children while being
> unprivileged. Also as noted in other replies on today's desktop
> systems, under a normal user session, the user should see all
> processes of the system, whereas the media player, browser etc have no
> business to see the process tree. This can be easily implemented when
> launching apps without the need to regain privileges...
>
>> simpler solution would be to add a new mount option local_hidepid to
>> procfs. If you set that option, then it overrides hidepid for that
>> instance. Most of these semi-sandboxed daemon processes already have
>> their own mount namespace, so the overhead should be minimal.
>
> Andy, if that could work :-/ we have to re-write or adapt a lot of
> things inside procfs... plus:
> Procfs is a mirror of the current pid namespace. Mount options belong
> not to procfs but rather to the pid namespace. That would not work.

I agree that the kernel change to do it per task is very simple. But
this is an unfortunate slippery slope. What if you want to block off
everything in /proc that isn't associated with a PID? What if you
want to suppress /sys access? What if you want to block *all*
non-current PIDs from being revealed in /proc? What if you want to
hide /proc/PID/cmdline?

I think that the right solution here is to fix procfs to understand
per-superblock mount options.

--Andy