Re: [PATCH v4 0/2] cgroup: allow management of subtrees by new cgroup namespaces

From: Tejun Heo
Date: Fri May 20 2016 - 13:49:50 EST


Hello, James.

On Fri, May 20, 2016 at 01:28:46PM -0400, James Bottomley wrote:
> Just so I'm clear: by delegation you mean create a subdirectory in the
> cgroup hierarchy with a non-root owner? We may have a solution for the
> escape constraints problem: see below.

Yeah, there's a delegation section in cgroup-v2.txt.

> > Unfortunately, cgroup hierarchy isn't designed to support this sort
> > of automatic delegation. Unpriv processes would be able to escape
> > constraints on v1 with some controllers and on v2 controllers have to
> > be explicitly enabled by root for delegated scope to have access to
> > them.
>
> Not necessarily. We also talked about pinning the cgroup tree so that
> once you enter the cgroup namespace, your current cgroup directory
> becomes your root, meaning you can't cd back into the ancestors and
> thus can't write their tasks file, meaning, I think, that it should be
> impossible to escape ancestor constraints.

I wish it were that clean. Unfortunately, on v1, some controllers
(memory and blkio depending on settings, net_cls and net_prio always)
simply aren't properly hierarchical, and if you have write permission
on a subdirectory you can escape the constraints of your ancestors.
Whether you can cd back up or not doesn't matter at all, so we can't
allow delegation by default.

> > Why does an unpriv NS need to have cgroup delegated to it without
> > cooperation from cgroup manager?
>
> There are actually many answers to this. The one I'm interested in is
> the ability for applications to make use of container features without
> having to ask permission from some orchestration engine. The problem

What are "container features"? Do you mean resource control by that?

> most people are looking at is how do I prevent the cgroup manager from
> running as root, because that's a security problem waiting to happen.

It's distributing system-wide resources, so the top of the tree will
always be owned by root, and delegating subtrees is a fairly minimal
operation. I don't see how that would necessarily lead to security
problems.
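
To make "fairly minimal" concrete, here is a rough sketch of what a
root-owned manager does to delegate a v2 subtree, following the
delegation section of the cgroup v2 documentation. The helper name
delegate_subtree and its parameters are made up for illustration, and
the set of files handed over is an assumption that depends on the
kernel version (older kernels document only cgroup.procs).

```python
import os

# Interface files handed over together with the directory itself.
# Assumption: this list matches the running kernel's documentation;
# every other interface file stays owned by the delegator.
DELEGATED_FILES = ("cgroup.procs", "cgroup.subtree_control")


def delegate_subtree(parent, name, uid, gid, controllers=()):
    """Create parent/name and hand it to uid:gid (hypothetical helper)."""
    # On v2 the owner of the parent has to enable controllers explicitly
    # before the delegated child can use them.
    if controllers:
        with open(os.path.join(parent, "cgroup.subtree_control"), "w") as f:
            f.write(" ".join("+" + c for c in controllers))

    child = os.path.join(parent, name)
    os.makedirs(child, exist_ok=True)

    # Delegation itself is just granting write access: chown the
    # directory and the few files the delegatee is allowed to write.
    os.chown(child, uid, gid)
    for fname in DELEGATED_FILES:
        path = os.path.join(child, fname)
        if os.path.exists(path):
            os.chown(path, uid, gid)
    return child
```

On a real system parent would be a directory under the cgroup2 mount
point and the caller would be root. Nothing here runs on behalf of the
delegatee, which is why the operation stays small.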

> > If for resource control, I'm pretty sure we don't want to allow
> > that without explicit cooperation from the enclosing scope.
>
> The enclosing scope should be allowed to define the parameters (happens
> today with namespaces) but there shouldn't be an active "thing" which
> is the permission gateway.

It's not that I fundamentally disagree that that'd be nice to have
but, given the way cgroup is designed and implemented currently, I'm
not sure this is a feasible goal.

Thanks.

--
tejun