Re: [PATCH v4 0/4] Introduce security_create_user_ns()
From: Serge E. Hallyn
Date: Mon Aug 15 2022 - 11:41:11 EST
On Sun, Aug 14, 2022 at 10:32:51PM -0400, Paul Moore wrote:
> On Sun, Aug 14, 2022 at 11:55 AM Serge E. Hallyn <serge@xxxxxxxxxx> wrote:
> > On Mon, Aug 08, 2022 at 03:16:16PM -0400, Paul Moore wrote:
> > > On Mon, Aug 8, 2022 at 2:56 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> > > > Paul Moore <paul@xxxxxxxxxxxxxx> writes:
> > > > > On Mon, Aug 1, 2022 at 10:56 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> > > > >> Frederick Lawler <fred@xxxxxxxxxxxxxx> writes:
> > > > >>
> > > > >> > While creating a LSM BPF MAC policy to block user namespace creation, we
> > > > >> > used the LSM cred_prepare hook because that is the closest hook to prevent
> > > > >> > a call to create_user_ns().
> > > > >>
> > > > >> Re-nack for all of the same reasons.
> > > > >> AKA This can only break the users of the user namespace.
> > > > >>
> > > > >> Nacked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> > > > >>
> > > > >> You aren't fixing what your problem is; you are papering over it by denying
> > > > >> access to the user namespace.
> > > > >>
> > > > >> Nack Nack Nack.
> > > > >>
> > > > >> Stop.
> > > > >>
> > > > >> Go back to the drawing board.
> > > > >>
> > > > >> Do not pass go.
> > > > >>
> > > > >> Do not collect $200.
> > > > >
> > > > > If you want us to take your comments seriously Eric, you need to
> > > > > provide the list with some constructive feedback that would allow
> > > > > Frederick to move forward with a solution to the use case that has
> > > > > been proposed. Your response above may be many things, but it is
> > > > > certainly not that.
> > > >
> > > > I did provide constructive feedback. My feedback to his problem
> > > > was to address the real problem of bugs in the kernel.
> > >
> > > We've heard from several people who have use cases which require
> > > adding LSM-level access controls and observability to user namespace
> > > creation. This is the problem we are trying to solve here; if you do
> > > not like the approach proposed in this patchset please suggest another
> > > implementation that allows LSMs visibility into user namespace
> > > creation.
> >
> > Regarding the observability - can someone concisely lay out why just
> > auditing userns creation would not suffice? Userspace could decide
> > what to report based on whether the creating user_ns == /proc/1/ns/user...
>
> One of the selling points of the BPF LSM is that it allows for various
> different ways of reporting and logging beyond audit. However, even
> if it was limited to just audit I believe that provides some useful
> justification as auditing fork()/clone() isn't quite the same and
> could be difficult to do at scale in some configurations. I haven't
> personally added a BPF LSM program to the kernel so I can't speak to
> the details on what is possible, but I'm sure others on the To/CC line
> could help provide more information if that is important to you.
>
> > Regarding limiting the tweaking of otherwise-privileged code by
> > unprivileged users, I wonder whether we could instead add smarts to
> > ns_capable().
>
> The existing security_capable() hook is eventually called by ns_capable():
>
>   ns_capable()
>     ns_capable_common()
>       security_capable(const struct cred *cred,
>                        struct user_namespace *ns,
>                        int cap,
>                        unsigned int opts);
>
> ... I'm not sure what additional smarts would be useful here?

Oh - I wasn't necessarily thinking of an LSM. I was picturing a
sysctl next to unprivileged_userns_clone. But you're right, it looks
like an LSM could already do this. Of course, there's an issue early
on in that the root user in the new namespace couldn't setuid, so
the uid mapping is still limited. So this idea probably isn't worth
the characters we've typed about it so far, sorry.

> [side note: SELinux does actually distinguish between capability
> checks in the initial user namespace vs child namespaces]
>
> > Point being, uid mapping would still work, but we'd
> > break the "privileged against resources you own" part of user
> > namespaces. I would want it to default to allow, but then when a
> > 0-day is found which requires reaching ns_capable() code, admins
> > could easily prevent exploitation until reboot from a fixed kernel.
>
> That assumes that everything you care about is behind a capability
> check, which is probably going to be correct in a lot of the cases,
> but I think it would be a mistake to assume that is always going to be
> true.

I might be thinking wrongly, but if it's not behind a capability check,
then it seems to me it's not an exploit that can only be reached by
becoming root in a user namespace, which means disabling user namespace
creation by unprivileged users will not stop the attack.