Re: [lxc-devel] Kernel bug? Setuid apps and user namespaces

From: Andy Lutomirski
Date: Fri Apr 04 2014 - 15:03:34 EST


On Fri, Apr 4, 2014 at 11:30 AM, Serge Hallyn <serge.hallyn@xxxxxxxxxx> wrote:
> Quoting Andy Lutomirski (luto@xxxxxxxxxxxxxx):
>> On 04/02/2014 10:32 AM, Serge E. Hallyn wrote:
>> > (Sorry - the lxc-devel list has moved, so replying to all with the
>> > correct list address; please reply to this rather than my previous
>> > email)
>> >
>> > Quoting Serge Hallyn (serge.hallyn@xxxxxxxxxx):
>> >> Hi Eric,
>> >>
>> >> (sorry, I don't seem to have the email I actually wanted to reply
>> >> to in my mbox, but it is
>> >> https://lists.linuxcontainers.org/pipermail/lxc-devel/2013-October/005857.html)
>> >>
>> >> You'd said,
>> >>> Someone needs to read and think through all of the corner cases and see
>> >>> if we can ever have a time when task_dumpable is false but root in the
>> >>> container would not or should not be able to see everything.
>> >>>
>> >>> In particular I am worried about the case of a setuid app calling setns,
>> >>> and entering a lesser privileged user namespace. In my foggy mind that
>> >>> might be a security problem. And there might be other similar crazy
>> >>> cases.
>> >>
>> >> Can we make use of current->mm->exe_file->f_cred->user_ns?
>> >>
>> >> So either always use
>> >> make_kuid(current->mm->exe_file->f_cred->user_ns, 0)
>> >> instead of make_kuid(cred->user_ns, 0), or check that
>> >> (current->mm->exe_file->f_cred->user_ns == cred->user_ns)
>> >> and, if not, assume that the caller has done a setns?
>>
>> Do you have a summary of the issue? I'm a little lost here.
>
> Sure - when running an unprivileged container, tasks which become
> !dumpable end up with /proc/$pid/fd/ being owned by the global
> root user, which inside the container is nobody:nogroup. Examples
> are the user's sshd threads and apache, and in the past I think I've
> seen it with logind or getty too.

Other than the aesthetics, why does this matter? Tasks in the
container that are actually mapped to nobody still can't access those
files, can they?

The alternative (using the container's owner) sounds a bit scary.
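
To make the symptom above easy to reproduce, here is a minimal sketch
for checking who owns a /proc/<pid>/fd directory as seen from inside
the container. The helper names (name_or_id, owner_of) are made up for
this example. Ids with no mapping in the current user namespace are
reported by stat as the overflow id (65534 by default), which is
typically what shows up as nobody:nogroup.

```python
import grp
import os
import pwd


def name_or_id(lookup, ident):
    """Return the name for a uid/gid, or the number as a string if no entry exists."""
    try:
        return lookup(ident)[0]
    except KeyError:
        return str(ident)


def owner_of(path):
    """Return (uid, gid, user, group) for path, as seen from the caller's user namespace."""
    st = os.stat(path)
    return (
        st.st_uid,
        st.st_gid,
        name_or_id(pwd.getpwuid, st.st_uid),
        name_or_id(grp.getgrgid, st.st_gid),
    )


def describe_fd_dir(pid):
    """Format the ownership of /proc/<pid>/fd for printing."""
    path = f"/proc/{pid}/fd"
    uid, gid, user, group = owner_of(path)
    return f"{path} is owned by {user}:{group} ({uid}:{gid})"
```

Calling print(describe_fd_dir(pid)) inside the container for one of
the affected daemons (sshd, apache) should show the nobody:nogroup
ownership if the task has become !dumpable.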

>
>> I suspect that what we really need is to revoke a bunch of proc files
>> every time a task does anything involving setuid (or, more generally,
>> any of the LSM_UNSAFE_PTRACE things).
>
> setuid, or do you mean setns? In any case, I'm not thinking through
> attach (setns'ing into a container) yet, but the cases I'm looking at
> right now are just a root daemon - already inside the non-init user
> ns - doing something to become !dumpable, and having its fds become
> owned by GLOBAL_ROOT_UID. Since these tasks are running a program
> which came from inside the non-init userns, I think it's sane to
> allow root in the non-init userns to own any coredumps.
>
> Whereas if the program had started as /bin/passwd in the init userns,
> then coredumps (and /proc/$$/fd/*) should be owned by the GLOBAL_ROOT_UID.

Gack.

This is kind of the same problem as the ptrace issue in the credfd
thread. Sigh.
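
For concreteness, here is a toy Python model of the ownership rule as I
understand the proposal quoted above (the second variant: compare the
exe_file namespace with cred->user_ns). It is not kernel code. The
class and function names are invented for illustration, user
namespaces are reduced to "which kuid is root here", and the 100000
mapping is just an example value.

```python
from dataclasses import dataclass

GLOBAL_ROOT_UID = 0  # kuid of root in the init user namespace


@dataclass(frozen=True)
class UserNS:
    """Toy user namespace: records only which kuid its uid 0 maps to."""
    name: str
    root_kuid: int


@dataclass(frozen=True)
class Task:
    euid_kuid: int   # effective uid, expressed as a kernel-wide kuid
    dumpable: bool
    cred_ns: UserNS  # stands in for cred->user_ns
    exe_ns: UserNS   # stands in for current->mm->exe_file->f_cred->user_ns


def proc_owner_current(task):
    """Behaviour described in the thread: a !dumpable task is owned by global root."""
    return task.euid_kuid if task.dumpable else GLOBAL_ROOT_UID


def proc_owner_proposed(task):
    """Proposed rule: use the namespace's root only if the executable was
    opened from the same user namespace the task's credentials live in."""
    if task.dumpable:
        return task.euid_kuid
    if task.exe_ns == task.cred_ns:
        return task.cred_ns.root_kuid
    # Mismatch: assume the caller did a setns(), so fall back to global root.
    return GLOBAL_ROOT_UID


init_ns = UserNS("init", GLOBAL_ROOT_UID)
container_ns = UserNS("container", 100000)  # example mapping only

# Root daemon started inside the container that became !dumpable.
daemon = Task(100000, False, container_ns, container_ns)
# Program started in the init userns that later entered the container.
entered = Task(0, False, container_ns, init_ns)

print(proc_owner_current(daemon), proc_owner_proposed(daemon))
print(proc_owner_current(entered), proc_owner_proposed(entered))
```

The only case where the two functions differ is the first one: the
daemon's files go to the container's root instead of GLOBAL_ROOT_UID.
Whether the exe_file comparison is actually a safe signal is exactly
the part that still needs thinking through.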

--Andy