Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled

From: Eric W. Biederman
Date: Fri Jan 22 2016 - 22:13:24 EST

Kees Cook <keescook@xxxxxxxxxxxx> writes:

> There continues to be unexpected side-effects and security exposures
> via CLONE_NEWUSER. For many end-users running distro kernels with
> CONFIG_USER_NS enabled, there is no way to disable this feature when
> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so
> admins not running containers or Chrome can avoid the risks of this
> feature.

I don't actually think there do continue to be unexpected side-effects
and security exposures with CLONE_NEWUSER. It takes a while for all of
the fixes to trickle out to distros. At most what I have seen recently
are problems with other kernel interfaces being amplified with user
namespaces. AKA the current mess with devpts, and the unexpected
issues with bind mounts in mount namespaces.

I have a couple of concerns with a sysctl.

1) As user namespaces settle out this sysctl has the potential to
decrease the security of the system overall as sandboxing
features of the kernel will not be available to unprivileged

Web browsing with Chrome will be less safe for example.

2) I strongly suspect the granularity of a sysctl is wrong for access to
user namespaces on a production system.

In general I suspect what we want is something like seccomp. I
believe all of the relevant bits are in registers. I actually
thought that was enough for seccomp. Does seccomp not work for
some reason?

3) A sysctl breeds a false sense of security in thinking that if a
security issue is discovered you can just flip a switch, disable
all new user namespaces and you won't be vulnerable.

In fact most of the issues in the past have only required being in
a user namespace to trigger. Which means any containers or user
namespaces that already exist could be used to exploit any newly
found issue. Which means that I don't think a sysctl will give
the desired level of protection.

In my analysis of the issues to date I don't know of anything
short of a reboot that would meaningfully remove the threat.

4) With applications like docker coming on-line I don't think a
restriction to processes with capabilities is actually meaningful
for restricting access to user namespaces.
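
To make the seccomp suggestion in point 2 concrete, here is a rough
sketch of a filter that refuses CLONE_NEWUSER for one process tree
instead of for the whole system. It is not part of the proposed patches.
The helper name deny_new_userns() is made up for illustration, and the
sketch assumes x86-64, where the flags word is the first argument of
both clone() and unshare().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

#ifndef __NR_clone3
#define __NR_clone3 435		/* older headers predate clone3() */
#endif

#define X32_SYSCALL_MASK 0x40000000U	/* bit set in x32 ABI syscall numbers */

/*
 * Hypothetical helper: after it returns 0, the calling thread and
 * everything it later forks or execs cannot create a user namespace.
 * Assumption: x86-64 only; other architectures need their own
 * AUDIT_ARCH_* value and may order the clone() arguments differently.
 */
static int deny_new_userns(void)
{
	struct sock_filter filter[] = {
		/* [0-2] Kill anything that is not a native x86-64 syscall. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, arch)),
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 1, 0),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),

		/* [3] Load the syscall number. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* [4] x32 numbers would slip past the checks below: EPERM. */
		BPF_JUMP(BPF_JMP | BPF_JGE | BPF_K, X32_SYSCALL_MASK, 6, 0),
		/* [5] clone3() hides its flags behind a pointer: ENOSYS. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clone3, 6, 0),
		/* [6-7] Only clone() and unshare() get their flags checked. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_clone, 1, 0),
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_unshare, 0, 2),

		/* [8] Low 32 bits of the first argument (the flags). */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, args[0])),
		/* [9] Deny if CLONE_NEWUSER is set, otherwise allow. */
		BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K, CLONE_NEWUSER, 1, 0),

		/* [10] */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
		/* [11] */
		BPF_STMT(BPF_RET | BPF_K,
			 SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
		/* [12] */
		BPF_STMT(BPF_RET | BPF_K,
			 SECCOMP_RET_ERRNO | (ENOSYS & SECCOMP_RET_DATA)),
	};
	struct sock_fprog prog = {
		.len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
		.filter = filter,
	};

	/* Needed so an unprivileged task may install a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

A session manager or service launcher could call something like this
before exec'ing whatever it does not trust with user namespaces, while
a browser sandbox started elsewhere keeps working. Like the sysctl, it
does nothing about user namespaces that already exist.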

So I have concerns about both efficacy and usability with the proposed
sysctl.

So to keep this productive, please tell me about the threat model
you envision, and how you envision knobs in the kernel being used to
counter those threats.