Re: [PATCHv1 7/8] cgroup: cgroup namespace setns support

From: Eric W. Biederman
Date: Sun Oct 19 2014 - 01:24:43 EST


"Serge E. Hallyn" <serge@xxxxxxxxxx> writes:

> Quoting Aditya Kali (adityakali@xxxxxxxxxx):
>> On Thu, Oct 16, 2014 at 2:12 PM, Serge E. Hallyn <serge@xxxxxxxxxx> wrote:
>> > Quoting Aditya Kali (adityakali@xxxxxxxxxx):
>> >> setns on a cgroup namespace is allowed only if
>> >> * task has CAP_SYS_ADMIN in its current user-namespace and
>> >> over the user-namespace associated with target cgroupns.
>> >> * task's current cgroup is descendent of the target cgroupns-root
>> >> cgroup.
>> >
>> > What is the point of this?
>> >
>> > If I'm a user logged into
>> > /lxc/c1/user.slice/user-1000.slice/session-c12.scope and I start
>> > a container which is in
>> > /lxc/c1/user.slice/user-1000.slice/session-c12.scope/x1
>> > then I will want to be able to enter the container's cgroup.
>> > The container's cgroup root is under my own (satisfying the
>> > below condition) but my cgroup is not a descendent of the
>> > container's cgroup.
>> >
>> This condition is there because we don't want to do implicit cgroup
>> changes when a process attaches to another cgroupns. cgroupns tries to
>> preserve the invariant that at any point, your current cgroup is
>> always under the cgroupns-root of your cgroup namespace. But in your
>> example, if we allow a process in "session-c12.scope" container to
>> attach to cgroupns root'ed at "session-c12.scope/x1" container
>> (without implicitly moving its cgroup), then this invariant won't
>> hold.
>
> Oh, I see. Guess that should be workable. Thanks.

Which has me looking at what the rules are for moving through
the cgroup hierarchy.

As long as we have write access to cgroup.procs and are allowed
to open the file for write, we can move any of our own tasks
into the cgroup. So the cgroup namespace rules don't seem
to be a problem.
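
For reference, the invariant Aditya describes above (a task's current
cgroup always sits under the cgroupns-root of its namespace) comes down
to a path containment test. What follows is only a userspace sketch and
not code from the patch: the helper name cgroup_path_is_under() is made
up, and it assumes both arguments are absolute, normalized cgroup paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not kernel code: returns true when the cgroup
 * path `cgrp` is `root` itself or lies somewhere below it. Both
 * arguments are assumed to be absolute, normalized paths.
 */
bool cgroup_path_is_under(const char *root, const char *cgrp)
{
	size_t n;

	assert(root && cgrp);
	n = strlen(root);

	/* Ignore trailing '/' on the root so that "/" and "/a/" both work. */
	while (n > 0 && root[n - 1] == '/')
		n--;

	/* The root must be a prefix of the task's cgroup path ... */
	if (strncmp(root, cgrp, n) != 0)
		return false;

	/* ... and the prefix must end on a path component boundary. */
	return cgrp[n] == '\0' || cgrp[n] == '/';
}
```

With Serge's example, a task in session-c12.scope fails this test
against a namespace rooted at session-c12.scope/x1, which is why it has
to move itself into x1 (through cgroup.procs) before the setns.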

Andy, can you please take a look at the permission checks in
__cgroup_procs_write?

As I read the code I see 3 security gaffes in the permission check.
- Using current->cred instead of file->f_cred.
- Not checking tcred->euid.
- Checking GLOBAL_ROOT_UID instead of having a capable call.
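
To make the three points concrete, here is a small userspace model of
what a tightened check might look like. It is a sketch under
assumptions and not the kernel code or a proposed patch: struct
model_cred, may_move_task() and the has_override_cap flag are all
invented for illustration. In particular, the list above does not say
how tcred->euid should enter the comparison; the sketch assumes the
stricter reading, where the opener's euid must match the target's uid,
euid and suid.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int model_uid_t;

/* Hypothetical stand-in for the credential fields the check looks at. */
struct model_cred {
	model_uid_t uid;
	model_uid_t euid;
	model_uid_t suid;
	bool has_override_cap;	/* models the result of a capable()-style call */
};

/*
 * `opener` models the credentials captured when cgroup.procs was
 * opened (the file->f_cred idea), not those of whoever happens to
 * call write() later. `target` models the task being moved.
 */
bool may_move_task(const struct model_cred *opener,
		   const struct model_cred *target)
{
	assert(opener && target);

	/* A capability decides the override, not a uid == 0 comparison. */
	if (opener->has_override_cap)
		return true;

	/* Assumed stricter reading: all three target ids must match. */
	return opener->euid == target->uid &&
	       opener->euid == target->euid &&
	       opener->euid == target->suid;
}
```

Under this model a plain user can still move their own tasks, but not a
setuid program they launched (target uid equals the user, target euid
does not), and uid 0 without the capability gets no special treatment.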

The file permissions on cgroup.procs seem just sufficient to keep
those bugs from being easily exploitable.

Eric