Re: [RFC][PATCH 1/5] memcg: change for softlimit.

From: Balbir Singh
Date: Fri Aug 28 2009 - 09:26:58 EST


* KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> [2009-08-28 16:35:23]:

> On Fri, 28 Aug 2009 12:50:08 +0530
> Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
>
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> [2009-08-28 13:23:21]:
> >
> > > This patch tries to modify softlimit handling in memcg/res_counter.
> > > There are 2 reasons in general.
> > >
> > > 1. soft_limit can be used only against a sub-hierarchy root.
> > > Because the softlimit tree is sorted by usage, putting plural groups
> > > under a hierarchy (which shares usage) will just add noise and unnecessary
> > > mess. This patch limits the softlimit feature to the hierarchy root only.
> > > This will make softlimit-tree maintenance better.
> > >
> > > 2. These days, it's reported that res_counter can be a bottleneck in
> > > massively parallel environments. We need to reduce the work done under
> > > the spinlock.
> > > The reason we check softlimit at res_counter_charge() is that any member
> > > in a hierarchy can have a softlimit.
> > > But with the changes in "1", only the hierarchy root has soft_limit. We
> > > can omit the hierarchical check in res_counter.
> > >
> > > After this patch, soft limit is available only for the root of a sub-hierarchy.
> > > (Anyway, softlimit for hierarchy children just confuses users and is hard to use.)
> > >
> >
> >
> > I need some time to digest this change. If the root is a hierarchy root,
> > then only the root can support soft limits? I think the change makes it
> > harder to use soft limits. Please help me understand better.
> >
> I pointed out this issue many, many times while you wrote the patch.
>
> memcg has "sub trees". Hierarchy here means a "sub tree" with use_hierarchy=1.
>
> Assume
>
>
> /cgroup/Users/            use_hierarchy=0
>     Gold/                 use_hierarchy=1
>         Bob
>         Mike
>     Silver/               use_hierarchy=1
>
> /cgroup/System/           use_hierarchy=1
>
> Flattened, there are 3 sub trees:
> /cgroup/Users/Gold (Gold has /cgroup/Users/Gold/Bob, /cgroup/Users/Gold/Mike)
> /cgroup/Users/Silver .....
> /cgroup/System .....
>
> So a subtree means a group which inherits charges via use_hierarchy=1.
>
> In the current implementation, softlimit can be set on an arbitrary cgroup.
> So the following settings are allowed.
> ==
> /cgroup/Users/Gold softlimit= 1G
> /cgroup/Users/Gold/Bob softlimit=800M
> /cgroup/Users/Gold/Mike softlimit=800M
> ==
>
> Then, how does your RB-tree for softlimit management work?
>
> When softlimit finds /cgroup/Users/Gold/, it will reclaim memory from
> all 3 groups by hierarchical_reclaim. If softlimit finds
> /cgroup/Users/Gold/Bob, reclaim from Bob means reclaim from Gold.

By "reclaim from Bob means reclaim from Gold", are you referring to the
uncharging part? If so, yes. But if you look at the tasks part, we
don't reclaim anything from the tasks in Gold.
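
To make the uncharging point concrete, here is a minimal userspace sketch,
assuming a simplified counter in which every charge is also accounted to each
ancestor. struct group, charge() and uncharge() are hypothetical names for
illustration only; this is not the kernel's res_counter code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a hierarchical counter. */
struct group {
    const char *name;
    struct group *parent;   /* NULL at the sub-hierarchy root */
    unsigned long usage;    /* own pages plus all descendants' pages */
};

/* Charging a group also charges every ancestor (use_hierarchy=1). */
static void charge(struct group *g, unsigned long pages)
{
    for (; g != NULL; g = g->parent)
        g->usage += pages;
}

/* Uncharging walks the same path, so the ancestors' usage drops too. */
static void uncharge(struct group *g, unsigned long pages)
{
    for (; g != NULL; g = g->parent) {
        assert(g->usage >= pages);  /* never uncharge more than was charged */
        g->usage -= pages;
    }
}
```

With Gold as the parent of Bob and Mike, uncharging pages from Bob lowers
both Bob's and Gold's usage, while Mike's usage is left alone.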

>
> Then, to keep the RB-tree neat, you have to extract all related cgroups and
> re-insert them all, every time.
> (But current code doesn't do that. It's broken.)

The earlier time-dependent code used to catch that, since it was time
based. Now that it is based on activity, it will take a while before
the group is updated. I don't think it is broken, but updates can lag
before showing up.

>
> The current soft-limit RB-tree will easily be broken, i.e. not sorted
> correctly, if used under use_hierarchy=1.
>

Not true, I think the sorted-ness is delayed and is seen when we pick
a tree for reclaim. Think of it as being lazy :)
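
One way to read "lazy" here, as a rough sketch only: the key stored for each
group may be stale, and it is re-checked at the moment a victim is picked. The
array below stands in for the RB-tree, and struct entry, excess() and
pick_victim() are hypothetical names, not the memcg implementation.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
    unsigned long usage;
    unsigned long soft_limit;
    unsigned long cached_excess;  /* key as last stored; may be stale */
};

/* Current amount by which usage exceeds the soft limit (0 if under). */
static unsigned long excess(const struct entry *e)
{
    return e->usage > e->soft_limit ? e->usage - e->soft_limit : 0;
}

/*
 * Pick the entry with the largest cached key, but trust the key only
 * after re-checking it. A stale key is refreshed and the search is
 * retried. Returns the index of the victim, or -1 if no cached key
 * is above zero.
 */
static int pick_victim(struct entry *v, int n)
{
    assert(v != NULL || n == 0);
    for (;;) {
        int best = -1;

        for (int i = 0; i < n; i++)
            if (v[i].cached_excess > 0 &&
                (best < 0 || v[i].cached_excess > v[best].cached_excess))
                best = i;
        if (best < 0)
            return -1;

        unsigned long now = excess(&v[best]);
        if (now == v[best].cached_excess)
            return best;              /* key was current: use it */
        v[best].cached_excess = now;  /* stale: refresh and retry */
    }
}
```

Note the limitation: an entry whose cached key is too low is not noticed
until something else refreshes it, which is the lag mentioned above.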

> My patch disallows setting softlimit on Bob and Mike and allows it only on
> Gold, because they can be considered the same class, i.e. one hierarchy.
>

But Bob and Mike might need to set soft limits between themselves. If
the soft limit of Gold is 1G and Bob needs to be close to 750M and Mike
250M, how do we do it without supporting what we have today?
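
As a small worked example of the 750M/250M split, assuming per-child soft
limits are kept: the sketch below computes each child's excess over its own
soft limit and picks the child furthest over. The usage figures in the note
after it are made up, and struct child, soft_excess() and furthest_over() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MiB (1024ULL * 1024ULL)

struct child {
    const char *name;
    unsigned long long usage;
    unsigned long long soft_limit;
};

/* Bytes by which a child exceeds its own soft limit (0 if under). */
static unsigned long long soft_excess(const struct child *c)
{
    return c->usage > c->soft_limit ? c->usage - c->soft_limit : 0;
}

/* Index of the child furthest above its soft limit, or -1 if none is. */
static int furthest_over(const struct child *kids, int n)
{
    int best = -1;

    assert(kids != NULL || n == 0);
    for (int i = 0; i < n; i++)
        if (soft_excess(&kids[i]) > 0 &&
            (best < 0 || soft_excess(&kids[i]) > soft_excess(&kids[best])))
            best = i;
    return best;
}
```

If Bob uses 700M against a 750M soft limit and Mike uses 400M against 250M,
Mike is 150M over and is the one selected, even though Bob uses more memory.
A single limit on Gold alone cannot express that distinction.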

--
Balbir