Re: [lxc-devel] Kernel bug? Setuid apps and user namespaces

From: Serge E. Hallyn
Date: Mon Apr 07 2014 - 14:13:43 EST


Quoting Andy Lutomirski (luto@xxxxxxxxxxxxxx):
> On Fri, Apr 4, 2014 at 12:10 PM, Serge Hallyn <serge.hallyn@xxxxxxxxxx> wrote:
> > Quoting Andy Lutomirski (luto@xxxxxxxxxxxxxx):
> >> On Fri, Apr 4, 2014 at 11:30 AM, Serge Hallyn <serge.hallyn@xxxxxxxxxx> wrote:
> >> > Quoting Andy Lutomirski (luto@xxxxxxxxxxxxxx):
> >> >> On 04/02/2014 10:32 AM, Serge E. Hallyn wrote:
> >> >> > (Sorry - the lxc-devel list has moved, so replying to all with the
> >> >> > correct list address; please reply to this rather than my previous
> >> >> > email)
> >> >> >
> >> >> > Quoting Serge Hallyn (serge.hallyn@xxxxxxxxxx):
> >> >> >> Hi Eric,
> >> >> >>
> >> >> >> (sorry, I don't seem to have the email I actually wanted to reply
> >> >> >> to in my mbox, but it is
> >> >> >> https://lists.linuxcontainers.org/pipermail/lxc-devel/2013-October/005857.html)
> >> >> >>
> >> >> >> You'd said,
> >> >> >>> Someone needs to read and think through all of the corner cases and see
> >> >> >>> if we can ever have a time when task_dumpable is false but root in the
> >> >> >>> container would not or should not be able to see everything.
> >> >> >>>
> >> >> >>> In particular I am worried about the case of a setuid app calling setns,
> >> >> >>> and entering a lesser privileged user namespace. In my foggy mind that
> >> >> >>> might be a security problem. And there might be other similar crazy
> >> >> >>> cases.
> >> >> >>
> >> >> >> Can we make use of current->mm->exe_file->f_cred->user_ns?
> >> >> >>
> >> >> >> So either always use
> >> >> >> make_kgid(current->mm->exe_file->f_cred->user_ns, 0)
> >> >> >> instead of make_kuid(cred->user_ns, 0), or check that
> >> >> >> (current->mm->exe_file->f_cred->user_ns == cred->user_ns)
> >> >> >> and, if not, assume that the caller has done a setns?
> >> >>
> >> >> Do you have a summary of the issue? I'm a little lost here.
> >> >
> >> > Sure - when running an unprivileged container, tasks which become
> >> > !dumpable end up with /proc/$pid/fd/ being owned by the global
> >> > root user, which inside the container is nobody:nogroup. Examples
> >> > are the user's sshd threads and apache, and in the past I think I've
> >> > seen it with logind or getty too.
> >>
> >> Other than the aesthetics, why does this matter? Things in the
> >> container who are actually mapped to nobody still can't access those
> >> files?
> >
> > Because root in the container cannot look at the fds.
>
> Right. I guess this is a problem.
>
> >
> >> The alternative (using the container's owner) sounds a bit scary.
> >
> > If the file being run belongs to the container, why would it be scary?
> > Because some fds may not have been closed when the task did execve,
> > where the previous bprm file may have been on the host?
>
> Meh. I'm not worried about that case, and that one probably doesn't
> cause !dumpable anyway. The nasty cases are unshare and setns.
>
> I'm starting to think that we need to extend dumpable to something
> much more general like a list of struct creds that someone needs to be
> able to ptrace, *in addition to current creds* in order to access
> sensitive /proc files, coredumps, etc. If you get started as setuid,

Hm, yeah, this sort of makes sense.

> then you start with two struct creds in the list (or maybe just your
> euid and uid). If you get started !setuid, then your initial creds
> are in the list. It's possible that few or no things will need to
> change that list after execve.
>
> If all of the entries and current->cred are in the same user_ns, then
> we can dump as userns root. If they're in different usernses, then we
> dump as global root or maybe the common ancestor root.
> setuid(getuid()) and other such nastiness may have to empty the list,
> or maybe we can just use a prctl for that.

A few questions:

1. Is there any other action which would trigger adding a new cred to
the list?

2. Would execve clear (and re-init) the list of creds?

> If this idea works, it would be straightforward to implement, and it
> might solve a number of problems.
>
> --Andy