Re: [RFC] cgroup TODOs

From: Tejun Heo
Date: Fri Sep 14 2012 - 13:53:15 EST


Hello, Peter.

On Fri, Sep 14, 2012 at 01:15:02PM +0200, Peter Zijlstra wrote:
> On Thu, 2012-09-13 at 13:58 -0700, Tejun Heo wrote:
> > The cpu ones handle nesting correctly - parent's accounting includes
> > children's, parent's configuration affects children's unless
> > explicitly overridden, and children's limits nest inside parent's.
>
> The implementation has some issues with fixed point math limitations on
> deep hierarchies/large cpu count, but yes.
>
> Doing soft-float/bignum just isn't going to be popular I guess ;-)

As things currently stand, I think the cpu stuff is a high enough bar to
aim for. That said, I do have some problems with how it handles tasks
vs. groups. I'll talk about that in another reply.
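
To illustrate the fixed point limitation Peter mentions above, here is a
minimal toy sketch. It is not the scheduler's actual arithmetic: the
10-bit precision (SHIFT), the even split among siblings, and both function
names are assumptions made up for the example. It only shows how a share
computed with truncating integer math loses precision as nesting deepens.

```python
from fractions import Fraction

SHIFT = 10            # assumed fixed-point precision: 10 fractional bits
ONE = 1 << SHIFT      # fixed-point representation of 1.0


def fixed_point_share(depth, siblings):
    """Share of a leaf `depth` levels down, with every level split evenly
    among `siblings` groups, computed with truncating integer math."""
    share = ONE
    for _ in range(depth):
        share = share // siblings   # integer division drops the remainder
    return share


def exact_share(depth, siblings):
    """The same quantity computed exactly, in the same fixed-point units."""
    return Fraction(ONE, siblings ** depth)


# Compare the truncated and the exact value at a few nesting depths.
for depth in (1, 2, 4, 6, 7):
    print(depth, fixed_point_share(depth, 3), float(exact_share(depth, 3)))
```

In this toy model, with three siblings per level, the truncated share
reaches zero at depth 7 while the exact value is still positive. More
fractional bits push the problem deeper but do not remove it.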

> People also don't seem to understand that each extra cgroup carries a
> cost and that nested cgroups are more expensive still, even if the
> intermediate levels are mostly empty (libvirt is a good example of how
> not to do things).
>
> Anyway, I guess what I'm saying is that we need to work on the awareness
> of cost associated with all this cgroup nonsense, people seem to think
> it's all good and free -- or not think at all, which, while depressing,
> seems the more likely option.

The decision may not have been conscious but it seems that we settled
on the direction where cgroup does more hierarchy-wise rather than
leaving non-scalable operations to each use case - e.g. filesystem
trees are very scalable but for that they give up a lot of tree-aware
things like knowing the size of a given subtree.
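
The trade-off described above can be sketched as follows. This is a
hypothetical model, not cgroup code: the Node class and both helper
functions are invented for illustration. A flat scheme keeps updates
cheap and pays for a full walk when someone asks for a subtree total,
while a hierarchy-aware scheme pays on every update (one step per
ancestor) so that the subtree total is always at hand.

```python
class Node:
    """One node in a tree; `local` is usage charged directly to it."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.local = 0      # usage charged to this node only
        self.subtree = 0    # usage of this node plus all descendants
        if parent is not None:
            parent.children.append(self)


def subtree_usage_by_walk(node):
    """Filesystem-like approach: nothing is aggregated on update, so a
    subtree total costs a walk over every descendant."""
    return node.local + sum(subtree_usage_by_walk(c) for c in node.children)


def charge_hierarchical(node, amount):
    """Hierarchy-aware approach: each update also walks up to the root,
    so every ancestor's subtree total stays current."""
    node.local += amount
    while node is not None:
        node.subtree += amount
        node = node.parent


root = Node()
a = Node(root)
b = Node(a)
charge_hierarchical(b, 5)
charge_hierarchical(a, 2)
print(root.subtree, subtree_usage_by_walk(root))
```

The per-update cost of the second approach grows with nesting depth,
which is the cost of deep hierarchies that Peter raises above.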

For what cgroup does, I think the naturally chosen direction is the
right one. Its functionality inherently requires more involvement
with the tree structure and we of course should try to document the
implications clearly and make things scale better where we can
(e.g. stat propagation has no reason to happen on every update).
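
As a rough sketch of the last point, stat propagation can be batched so
that ancestors are only touched once a local delta grows large enough.
Everything here is an assumption for illustration: the Group class, the
`batch` threshold, and the method names are hypothetical and do not
describe any existing kernel interface.

```python
class Group:
    """A group whose usage is pushed to its ancestors in batches."""

    def __init__(self, parent=None, batch=32):
        self.parent = parent
        self.batch = batch    # assumed flush threshold
        self.pending = 0      # local delta not yet pushed to ancestors
        self.total = 0        # flushed usage of this group and descendants

    def charge(self, amount):
        """Cheap common path: touch only local state unless the pending
        delta has grown past the threshold."""
        self.pending += amount
        if abs(self.pending) >= self.batch:
            self.flush()

    def flush(self):
        """Push the pending delta to this group and every ancestor."""
        delta, self.pending = self.pending, 0
        group = self
        while group is not None:
            group.total += delta
            group = group.parent


root = Group()
child = Group(parent=root, batch=10)
child.charge(3)     # stays local, root is not touched
child.charge(8)     # pending reaches 11, which triggers a flush
print(root.total, child.pending)
```

The price is that a parent's total can lag behind by up to one batch per
descendant, so a reader that needs an exact figure has to flush first.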

Thanks.

--
tejun