Re: [kernel-hardening] Re: [PATCH resend 2/2] userns: control capabilities of some user namespaces
From: Daniel Micay
Date: Mon Nov 06 2017 - 21:16:14 EST
On Mon, 2017-11-06 at 16:14 -0600, Serge E. Hallyn wrote:
> Quoting Daniel Micay (danielmicay@xxxxxxxxx):
> > Substantial added attack surface will never go away as a problem.
> > There aren't a finite number of vulnerabilities to be found.
>
> There's varying levels of usefulness and quality. There is code which
> I want to be able to use in a container, and code which I can't ever
> see a reason for using there. The latter, especially if it's also in a
> staging driver, would be nice to have a toggle to disable.
>
> You're not advocating dropping the added attack surface, only adding a
> way of dealing with an 0day after the fact. Privilege raising 0days
> can exist anywhere, not just in code which only root in a user
> namespace can exercise. So from that point of view, ksplice seems a
> more complete solution. Why not just actually fix the bad code block
> when we know about it?
That's not what I'm advocating. I only care about it for proactive
attack surface reduction downstream. I have no interest in using it to
block access to known vulnerabilities.
> Finally, it has been well argued that you can gain many new caps from
> having only a few others. Given that, how could you ever be sure
> that, if an 0day is found which allows root in a user ns to abuse
> CAP_NET_ADMIN against the host, just keeping CAP_NET_ADMIN from them
> would suffice?
I didn't suggest using it that way...
> It seems to me that the existing control in
> /proc/sys/kernel/unprivileged_userns_clone might be the better duct
> tape in that case.
There's no such thing as unprivileged_userns_clone in mainline.
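As a rough illustration of that difference, a script can only find out whether the knob exists by probing for the file. The sketch below is not part of the patch; the helper name is made up, and it assumes the Debian convention that "0" means unprivileged user namespace creation is disabled.

```python
from pathlib import Path
from typing import Optional

# Path used by the out-of-tree Debian patch; mainline kernels do not have it.
DEBIAN_SYSCTL = Path("/proc/sys/kernel/unprivileged_userns_clone")


def unprivileged_userns_setting(path: Path = DEBIAN_SYSCTL) -> Optional[bool]:
    """Hypothetical helper: report the state of the distro-specific knob.

    Returns None when the file is absent (e.g. a mainline kernel),
    False when it contains "0", and True for any other value.
    """
    try:
        raw = path.read_text().strip()
    except FileNotFoundError:
        return None
    return raw != "0"
```

A result of None only says the sysctl is missing. It does not say whether unprivileged user namespaces are usable on that kernel.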
The advantage of this over the unprivileged_userns_clone sysctl in
Debian and maybe some other distributions is that it doesn't give up
unprivileged app containers / sandboxes implemented via user namespaces.
For example, Chromium's user namespace sandbox likely only needs to have
CAP_SYS_CHROOT. Chromium will be dropping their setuid sandbox, which
forces usage of user namespaces to avoid losing the sandbox. That will
greatly increase local kernel attack surface on the host by exposing
netfilter management, etc. to unprivileged users.
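To make the Chromium case concrete, here is a minimal sketch of the kind of capability allow-list being discussed. It is only an illustration of the bitmask arithmetic, not the interface from the patch: the helper names and SANDBOX_ALLOWED are hypothetical, and the assumption that CAP_SYS_CHROOT alone is enough comes from the "likely" above. The capability numbers are the ones in include/uapi/linux/capability.h.

```python
# Capability numbers as defined in include/uapi/linux/capability.h.
CAP_NET_ADMIN = 12
CAP_SYS_CHROOT = 18
CAP_SYS_ADMIN = 21


def cap_mask(*caps: int) -> int:
    """Build a bitmask with one bit set per capability number."""
    mask = 0
    for cap in caps:
        mask |= 1 << cap
    return mask


def is_allowed(mask: int, cap: int) -> bool:
    """Check whether a capability's bit is present in the mask."""
    return bool(mask & (1 << cap))


# Hypothetical allow-list for a sandbox that only needs chroot().
SANDBOX_ALLOWED = cap_mask(CAP_SYS_CHROOT)

# Netfilter management requires CAP_NET_ADMIN, which the mask leaves out.
print(is_allowed(SANDBOX_ALLOWED, CAP_SYS_CHROOT))
print(is_allowed(SANDBOX_ALLOWED, CAP_NET_ADMIN))
```

With a mask like this, the sandbox keeps the one capability it uses while CAP_NET_ADMIN and CAP_SYS_ADMIN stay out of reach inside the namespace.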
The proposed approach isn't necessarily the best way to implement this
kind of mitigation but I think it's filling a real need.