Re: [PATCH v2] sysctl: allow CLONE_NEWUSER to be disabled
From: Robert Święcki
Date: Thu Jan 28 2016 - 14:11:31 EST
2016-01-28 18:48 GMT+01:00 Eric W. Biederman <ebiederm@xxxxxxxxxxxx>:
> Kees Cook <keescook@xxxxxxxxxxxx> writes:
>
>> + if (sysctl_userns_restrict && !(capable(CAP_SYS_ADMIN) &&
>> + capable(CAP_SETUID) &&
>> + capable(CAP_SETGID)))
>> + return -EPERM;
>> +
>
> I will also note that the way I have seen containers used this check
> adds no security and is not mentioned or justified in any way in your
> patch description.
>
> Furthermore this looks like blame shifting. And quite frankly shifting
> the responsibility to users if they get hacked is not an acceptable
> attitude.
I think I'm starting to understand your point. If I'm not mistaken,
you are saying that user namespaces themselves are not buggy; the bugs
are in parts of the kernel that would otherwise not be reachable from
the typical low-privilege level of regular users (e.g. bugs in
SOCK_RAW sockets, iptables or mounts)?
If so, I can agree with that wording, but the proposed sysctl might
still be needed in that case. I guess those parts of the kernel that
were previously not reachable by unprivileged users have historically
received much less attention in terms of security review and security
fuzzing. As much as kernel users would like to see those parts tested
to the same level as the attack surface that has always been reachable
by unprivileged users, that will not happen tomorrow. Our best option
for now might be a switchable setting that disables this attack
surface for those users who feel they need to. In the meantime, we can
concentrate on security-reviewing those newly reachable kernel APIs,
so that some day we can remove this sysctl.
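
To make the "switchable setting" concrete, here is a rough userspace
sketch of the condition in the quoted hunk. It is not kernel code: the
struct caps type and the userns_create_check() name are made up for
illustration, and the booleans only stand in for what capable() would
report for CAP_SYS_ADMIN, CAP_SETUID and CAP_SETGID.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three capabilities the quoted hunk tests. */
struct caps {
    bool sys_admin; /* models capable(CAP_SYS_ADMIN) */
    bool setuid;    /* models capable(CAP_SETUID) */
    bool setgid;    /* models capable(CAP_SETGID) */
};

/*
 * Mirrors the condition from the quoted patch: when the restrict knob
 * is non-zero, the caller must hold all three capabilities, otherwise
 * the request is refused with -EPERM. Returns 0 when it may proceed.
 */
int userns_create_check(int userns_restrict, const struct caps *c)
{
    if (userns_restrict &&
        !(c->sys_admin && c->setuid && c->setgid))
        return -EPERM;
    return 0;
}
```

With the knob at 0 the check is a no-op for everyone; with it set, a
caller holding only some of the three capabilities is still refused,
which is what takes the newly reachable code paths away from ordinary
users.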