Re: [PATCH] [RFC] Make it easier to harden /proc/

From: Eric W. Biederman
Date: Wed Mar 16 2011 - 17:18:12 EST


Richard Weinberger <richard@xxxxxx> writes:

> On Wednesday 16 March 2011, 21:45:45, Arnd Bergmann wrote:
>> On Wednesday 16 March 2011 21:08:16 Richard Weinberger wrote:
>> > On Wednesday 16 March 2011, 20:55:49, Kees Cook wrote:
>> > > On Wed, Mar 16, 2011 at 08:31:47PM +0100, Richard Weinberger wrote:
>> > > > When containers like LXC are used, an unprivileged and jailed
>> > > > root user can still write to critical files in /proc/.
>> > > > E.g: /proc/sys/kernel/{sysrq, panic, panic_on_oops, ... }
>> > > >
>> > > > This new restricted attribute makes it possible to protect such
>> > > > files. When restricted is set to true, root needs CAP_SYS_ADMIN
>> > > > to write into the file.
>> > >
>> > > I was thinking about this too. I'd prefer more fine-grained control
>> > > in this area, since some sysctl entries aren't strictly controlled by
>> > > CAP_SYS_ADMIN (e.g. mmap_min_addr is already checking CAP_SYS_RAWIO).
>> > >
>> > > How about this instead?
>> >
>> > Good idea.
>> > Maybe we should also consider a per-directory restriction.
>> > Every file in /proc/sys/{kernel/, vm/, fs/, dev/} needs a protection.
>> > It would be much easier to set the protection on the parent directory
>> > instead of protecting file by file...
>>
>> How does this interact with the per-namespace sysctls that Eric
>> Biederman added a few years ago?
>
> Do you mean CONFIG_{UTS, IPC, USER, NET}_NS?
>
>> I had expected that any dangerous sysctl would not be visible in
>> an unprivileged container anyway.
>
> No way.
> That's why it's currently a very good idea to mount /proc/ read-only
> into a container.

However, it is in the architecture. The problem is that the user
namespace is not finished. Once it is finished, even root with all caps
in a container will have no more permissions than the unprivileged user
that created the user namespace.

Essentially the change is to make permission checks a comparison of the
tuple (user_ns, uid) instead of a comparison by uid alone. If we want to
fix permission problems with proc and containers, please let's focus on
completing the user namespace.
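
As a rough illustration of the tuple comparison, here is a userspace
sketch. It is not kernel code: the struct, its fields, and both function
names are made up for this example, and a real user namespace is not a
plain integer.

```c
#include <stdbool.h>

/* Hypothetical model: an identity is the pair (user namespace, uid)
 * rather than a bare uid. The integer namespace id is an assumption
 * made only for this sketch. */
struct ns_uid {
    unsigned int user_ns;  /* illustrative namespace identifier */
    unsigned int uid;      /* uid as seen inside that namespace */
};

/* Comparison by uid alone: uid 0 in a container matches uid 0 on
 * the host, which is the problem being discussed. */
bool same_owner_by_uid(struct ns_uid a, struct ns_uid b)
{
    return a.uid == b.uid;
}

/* Comparison of the tuple: both the namespace and the uid must match,
 * so root in one namespace is not root in another. */
bool same_owner_by_tuple(struct ns_uid a, struct ns_uid b)
{
    return a.user_ns == b.user_ns && a.uid == b.uid;
}
```

With host root modelled as (0, 0) and container root as (1, 0), the
uid-only check treats them as the same owner while the tuple check does
not.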

Eric