Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled

From: Kees Cook
Date: Mon Jan 25 2016 - 13:56:53 EST


On Mon, Jan 25, 2016 at 10:53 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Mon, Jan 25, 2016 at 10:51 AM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> On Sun, Jan 24, 2016 at 2:22 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>> On Fri, Jan 22, 2016 at 7:02 PM, Eric W. Biederman
>>> <ebiederm@xxxxxxxxxxxx> wrote:
>>>> Kees Cook <keescook@xxxxxxxxxxxx> writes:
>>>>
>>>>> There continues to be unexpected side-effects and security exposures
>>>>> via CLONE_NEWUSER. For many end-users running distro kernels with
>>>>> CONFIG_USER_NS enabled, there is no way to disable this feature when
>>>>> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so
>>>>> admins not running containers or Chrome can avoid the risks of this
>>>>> feature.
>>>>
>>>> I don't actually think there do continue to be unexpected side-effects
>>>> and security exposures with CLONE_NEWUSER. It takes a while for all of
>>>> the fixes to trickle out to distros. At most what I have seen recently
>>>> are problems with other kernel interfaces being amplified with user
>>>> namespaces. AKA the current mess with devpts, and the unexpected
>>>> issues with bind mounts in mount namespaces.
>>>>
>>>
>>>>
>>>> So to keep this productive. Please tell me about the threat model
>>>> you envision, and how you envision knobs in the kernel being used to
>>>> counter those threats.
>>>
>>> I consider the ability to use CLONE_NEWUSER to acquire CAP_NET_ADMIN
>>> over /any/ network namespace and to thus access the network
>>> configuration API to be a huge risk. For example, unprivileged users
>>> can program iptables. I'll eat my hat if there are no privilege
>>> escalations in there. (They can't request module loading, but still.)
>>
>> Should I consider this an Ack for the patch? :)
>
> Only if you explain why you need the CAP_SYS_ADMIN check. :)

Hm? In the sysctl write? Because otherwise a non-cap root user could
turn "1" to "0". The restriction on CLONE_NEWUSER checks caps, not
uid, so the uid must be protected by cap checks. The DAC permissions
on sysctls for cap-based restrictions make no sense -- they need to be
doing cap checks not DAC checks. It's the same logic for why
dmesg_restrict and kptr_restrict use the same cap check.
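
To sketch the difference: this is a user-space model only, not code from
the patch or the kernel. struct task_model, dac_allows_write() and
restrict_sysctl_write() are hypothetical names made up for illustration,
and "root-owned, root-writable sysctl file" is an assumption about the
DAC setup.

```c
#include <stdbool.h>

/* Hypothetical model of a task's credentials (not a kernel structure). */
struct task_model {
    unsigned int euid;   /* effective uid; 0 is root */
    bool cap_sys_admin;  /* true if CAP_SYS_ADMIN is in the effective set */
};

#define MODEL_EPERM 1    /* stand-in for EPERM in this model */

/*
 * DAC check only: assume the sysctl file is root-owned and writable by
 * its owner, so any task with euid 0 passes, whatever caps it holds.
 */
static bool dac_allows_write(const struct task_model *t)
{
    return t->euid == 0;
}

/*
 * Model of a sysctl write. With check_cap set, the write also requires
 * CAP_SYS_ADMIN, which is the pattern described above for
 * dmesg_restrict and kptr_restrict. Without it, only DAC applies.
 */
static int restrict_sysctl_write(const struct task_model *t, bool check_cap,
                                 int *knob, int val)
{
    if (!dac_allows_write(t))
        return -MODEL_EPERM;
    if (check_cap && !t->cap_sys_admin)
        return -MODEL_EPERM;
    *knob = val;
    return 0;
}
```

In this model, a task with euid 0 and no CAP_SYS_ADMIN can flip the knob
from 1 to 0 when check_cap is false, and is refused when it is true.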

> IOW, I think you could change that one line of code and have a less
> weird version of the patch that would work just fine.

Well, I don't know about less weird, but it would leave an unneeded
hole in the permission checks.

-Kees

--
Kees Cook
Chrome OS & Brillo Security