Re: [PATCH v4 0/2] cgroup: allow management of subtrees by new cgroup namespaces

From: Aditya Kali
Date: Fri May 20 2016 - 13:33:30 EST


On Fri, May 20, 2016 at 9:25 AM, James Bottomley
<James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, 2016-05-20 at 09:17 -0700, Tejun Heo wrote:
> > Hello, James.
> >
> > On Fri, May 20, 2016 at 12:09:10PM -0400, James Bottomley wrote:
> > > I think it's just different definitions. If you take on our
> > > definition of being able to set up a container without any admin
> > > intervention, do you see our problem: we can't get the initial
> > > delegation of the hierarchy.
> >
> > Yeah, I can see the difference but we can't solve that by special
> > casing NS case.
>
> Great, we agree on the problem definition ... as I said, I'm not saying
> this patch is the solution, but it gives us a starting point for
> exploring whether there is a solution.
>
> > This is stemming from the fact that an unpriv application can't
> > create its sub-cgroups without explicit delegation from the root and
> > that has always been an explicit design choice.
> > It's tied to who's responsible for cleanup afterwards and what
> > happens when the process gets migrated to a different cgroup. The
> > latter is an important issue on v1 hierarchies because migrating
> > tasks sometimes is used as a way to control resource distribution.
>
> OK, so is the only problem cleanup? If so, what if I proposed that a
> cgroup directory could only be created by the owner of the userns
> (which would be any old unprivileged user) iff they create a cgroup ns
> and the cgroup ns would be responsible for removing it again, so the
> cgroup subdirectory would be tied to the cgroup namespace as its holder
> and we'd use release of the cgroup to remove all the directories?
>

The cgroup namespace doesn't own the resources in the cgroupns-root, so
I am not sure how it would be able to do the cleanup either. I.e., even
if all the processes in the cgroup ns die, it doesn't mean that the
cgroupns-root they belonged to is available for cleanup. For this
reason, one of the implicit design choices in cgroupns was that the
cgroupns-root should already exist and the target process should
already be moved into it (presumably by some admin process) before
the cgroupns is created.

Moreover, the subsystem controllers (cpu, memory, etc.) are oblivious
to cgroup namespaces. So, for example, creating a new cgroup namespace
doesn't affect the reclaim behavior, but allowing
creation/modification of sub-cgroups does. So I think allowing
any unprivileged process to do that cannot be considered safe for now.
Explicit approval from some admin process will still be needed (which
can be given by chmod/chown today).
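
As a rough sketch of what that explicit approval might look like from
the admin side: the helper below chowns an existing cgroup directory,
plus whichever of its task-attach files are present, to an unprivileged
user. The helper name, the file list ("cgroup.procs", "tasks") and the
example path are assumptions for illustration only; the exact set of
files that needs to be handed over depends on the hierarchy in use.

```python
import os

# Assumption: these are the files a delegatee needs write access to in
# order to move tasks into the subtree. Adjust for the hierarchy in use.
DELEGATED_FILES = ("cgroup.procs", "tasks")


def delegate_subtree(cgroup_dir, uid, gid, files=DELEGATED_FILES):
    """Hypothetical admin-side helper: give uid:gid ownership of an
    already-existing cgroup directory so that user can create and
    manage sub-cgroups beneath it.

    Returns the list of paths whose ownership was changed.
    """
    changed = []

    # Owning the directory is what permits mkdir/rmdir of sub-cgroups.
    os.chown(cgroup_dir, uid, gid)
    changed.append(cgroup_dir)

    # Owning the attach files is what permits moving tasks in.
    for name in files:
        path = os.path.join(cgroup_dir, name)
        if os.path.exists(path):
            os.chown(path, uid, gid)
            changed.append(path)

    return changed
```

An admin process would call something like
delegate_subtree("/sys/fs/cgroup/memory/containers/c1", 1000, 1000)
(hypothetical path and ids) after creating the directory and moving the
target process into it, and only then would the process create its
cgroupns.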


>
> James


Thanks,

--
Aditya