Re: [PATCH 0/5] RFC: CGroup Namespaces

From: Andy Lutomirski
Date: Tue Jul 29 2014 - 11:09:11 EST

On Mon, Jul 28, 2014 at 9:51 PM, Serge E. Hallyn <serge@xxxxxxxxxx> wrote:
> Quoting Aditya Kali (adityakali@xxxxxxxxxx):
>> Thank you for your review. I have tried to respond to both your emails here.
>> On Thu, Jul 24, 2014 at 9:36 AM, Serge Hallyn <serge.hallyn@xxxxxxxxxx> wrote:
>> > Quoting Aditya Kali (adityakali@xxxxxxxxxx):
>> >> Background
>> >> Cgroups and Namespaces are used together to create "virtual"
>> >> containers that isolate the host environment from the processes
>> >> running in the container. But since cgroups themselves are not
>> >> "virtualized", a task is always able to see the global cgroup view
>> >> through the cgroupfs mount and via the /proc/self/cgroup file.
>> >>
>> > Hi,
>> >
>> > A few questions/comments:
>> >
>> > 1. Based on this description, am I to understand that after doing a
>> > cgroupns unshare, 'mount -t cgroup cgroup /mnt' by default will
>> > still mount the global root cgroup? Any plans on "changing" that?
>> This is suggested in the "Possible Extensions of CGROUPNS" section.
>> More details below.
>> > Will attempts to change settings of a cgroup which is not under
>> > our current ns be rejected? (That should be easy to do given your
>> > patch 1/5). Sorry if it's done in the set, I'm jumping around...
>> >
>> Currently, only 'cgroup_attach_task', 'cgroup_mkdir' and
>> 'cgroup_rmdir' of cgroups outside of cgroupns-root are prevented.
>> Reads and writes of actual cgroup properties are not prevented; the usual
>> permission checks continue to apply for those. I was hoping that
>> should be enough, but see more comments towards the end.
>> > 2. What would be the repercussions of allowing cgroupns unshare so
>> > long as you have ns_capable(CAP_SYS_ADMIN) to the user_ns which
>> > created your current ns cgroup? It'd be a shame if that wasn't
>> > on the roadmap.
>> >
>> It's certainly on the roadmap, just that some logistics were not clear
>> at this time. As pointed out by Andy Lutomirski on [PATCH 5/5] of this
>> series, if we allow cgroupns creation to ns_capable(CAP_SYS_ADMIN)
>> processes, we may need some kind of explicit permission from the
>> cgroup subsystem to allow this. One approach could be an explicit
> So long as you do ns_capable(cgroup_ns->user_ns, CAP_SYS_ADMIN) I think
> you're fine.
> The only real problem I can think of with unsharing a cgroup_ns is that
> you could lock a setuid-root application someplace it wasn't expecting.
> The above check guarantees that you were privileged enough that you'd
> better be trusted in this user namespace.
> (Unless there is some possible interaction I'm overlooking)

I think that, if it's done this way, you'd have to unshare cgroupns
before unsharing userns, since you forfeit that capability when you
unshare your userns. That means that the new cgroupns ends up being
associated w/ the root userns, which may not be what you want.
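
A toy userspace model of that ordering problem, as a rough sketch only.
It is not kernel code: the structs and the toy_* helpers are hypothetical
names made up for illustration, and the capability check is a simplified
stand-in for ns_capable() that assumes caps held in a userns apply to its
descendants but never to its ancestors.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; all names here are hypothetical. */
struct user_ns {
    const struct user_ns *parent;   /* NULL for the root userns */
};

struct cgroup_ns {
    const struct user_ns *user_ns;  /* userns that owns this cgroupns */
};

struct task {
    const struct user_ns *user_ns;  /* userns the task's caps live in */
    bool cap_sys_admin;             /* CAP_SYS_ADMIN in that userns? */
    struct cgroup_ns cgroup_ns;
};

/*
 * Simplified ns_capable(): caps held in a userns also apply to its
 * descendants, never to its ancestors. (The real check also has an
 * owner rule, which is ignored here.)
 */
static bool toy_ns_capable(const struct task *t, const struct user_ns *target)
{
    for (const struct user_ns *ns = target; ns != NULL; ns = ns->parent)
        if (ns == t->user_ns)
            return t->cap_sys_admin;
    return false;
}

/* Enter a fresh userns: full caps there, none in the parent. */
static void toy_unshare_userns(struct task *t, struct user_ns *fresh)
{
    fresh->parent = t->user_ns;
    t->user_ns = fresh;
    t->cap_sys_admin = true;
}

/* The proposed rule: need CAP_SYS_ADMIN in the current cgroupns's userns. */
static bool toy_unshare_cgroupns(struct task *t)
{
    if (!toy_ns_capable(t, t->cgroup_ns.user_ns))
        return false;
    t->cgroup_ns.user_ns = t->user_ns;
    return true;
}
```

Starting from a privileged task in the root userns, calling
toy_unshare_userns() first makes toy_unshare_cgroupns() fail, while
calling toy_unshare_cgroupns() first succeeds but leaves the new
cgroupns owned by the root userns.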

You could unshare both namespaces in one syscall and give that some
magic semantics, but that's kind of weird. It would be nice if you
could unshare your userns and temporarily retain caps in the parent,
but there is no such mechanism right now.
