Re: 3.13-rc breaks MEMCG_SWAP
From: Hugh Dickins
Date: Mon Dec 16 2013 - 20:42:26 EST

On Mon, 16 Dec 2013, Tejun Heo wrote:
> On Mon, Dec 16, 2013 at 06:19:37PM +0100, Michal Hocko wrote:
> > I have to think about it some more (the brain is not working anymore
> > today). But what we really need is that nobody gets the same id while
> > the css is alive. So css_from_id returning NULL doesn't seem to be
> > enough.
>
> Oh, I meant whether it's necessary to keep css_from_id() working
> (ie. doing successful lookups) between offline and release, because
> that's where lifetimes are coupled. IOW, if it's enough for cgroup to
> not recycle the ID until all css's are released && fail css_from_id()
> lookup after the css is offlined, I can make a five liner quick fix.

Don't take my word on it, I'm too fuzzy on this: but although it would
be good to refrain from recycling the ID until all css's are released,
I believe that it would not be good enough to fail css_from_id() once
the css is offlined - mem_cgroup_uncharge_swap() needs to uncharge the
hierarchy of the dead memcg (for example, when a tmpfs file is removed).
Uncharging the dead memcg itself is presumably irrelevant, but it does
need to locate the right parent to uncharge, and NULL css_from_id()
would make that impossible. It would be easy if we said those charges
migrate to root rather than to parent, but that's inconsistent with
what we have happily converged upon doing elsewhere (in the preferred
use_hierarchy case), and it would be a change in behaviour.
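
To make the parent-lookup point concrete, here is a minimal user-space
sketch, not kernel code. It assumes swap charges are counted in the owning
group and in every ancestor, and that a swap entry only remembers the
owner's ID. All names (struct memcg, lookup_by_id(), uncharge_swap(),
leftover_in_parent()) are hypothetical stand-ins, and the "strict" flag
models a css_from_id() that returns NULL once the css is offlined.

```c
/* User-space model only: every name here is hypothetical, not kernel code. */
#include <assert.h>
#include <stddef.h>

#define MAX_ID 8

struct memcg {
    struct memcg *parent;   /* NULL for the root group */
    long memsw;             /* hierarchical mem+swap charge, in pages */
    int online;             /* cleared at offline, before release */
};

static struct memcg *id_table[MAX_ID];

/* strict != 0 models a lookup that fails once the group is offlined. */
static struct memcg *lookup_by_id(unsigned short id, int strict)
{
    struct memcg *m;

    assert(id < MAX_ID);
    m = id_table[id];
    if (m && strict && !m->online)
        return NULL;
    return m;
}

/* Charge one swapped-out page to the group and to all of its ancestors. */
static void charge_swap(struct memcg *m)
{
    for (; m; m = m->parent)
        m->memsw++;
}

/*
 * Uncharge one swap entry recorded against 'id'.
 * Returns 0 on success, -1 if the owner cannot be located.
 */
static int uncharge_swap(unsigned short id, int strict)
{
    struct memcg *m = lookup_by_id(id, strict);

    if (!m)
        return -1;          /* the ancestors keep a stale charge */
    for (; m; m = m->parent)
        m->memsw--;
    return 0;
}

/*
 * Charge one page to a child, offline the child, free the swap entry,
 * and report what is still charged to the parent afterwards.
 */
static long leftover_in_parent(int strict)
{
    struct memcg root = { NULL, 0, 1 };
    struct memcg parent = { &root, 0, 1 };
    struct memcg child = { &parent, 0, 1 };

    id_table[1] = &root;
    id_table[2] = &parent;
    id_table[3] = &child;

    charge_swap(&child);
    child.online = 0;               /* offlined, but not yet released */
    uncharge_swap(3, strict);       /* e.g. the tmpfs file is removed */

    id_table[1] = id_table[2] = id_table[3] = NULL;
    return parent.memsw;
}
```

In this model leftover_in_parent(0) leaves the parent fully uncharged,
while leftover_in_parent(1) leaves one page charged to the parent with no
way to find it again, which is the failure described above.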

I'm not nearly as enthusiastic about my patch as Michal is: I really
would prefer a five-liner from you or from Zefan. I do think (and
this is probably what Michal likes) that my patch leaves MEMCG_SWAP
less surprising, and less likely to cause similar trouble in future;
but it's not how Kame chose to implement it, and it has those nasty
swap_cgroup array scans adding to the overhead of memcg removal -
we can layer on several different hacks/optimizations to reduce that
overhead, but I think it's debatable whether that will end up as an
improvement over what we have had until now.
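
For readers unfamiliar with the cost in question, the following is a rough
user-space sketch of what such a scan might look like. It is an assumption
about the general shape, not the code from the patch: the owner array, the
function name reassign_swap_owner(), and the convention that ID 0 means
"unowned" are all hypothetical. The point is only that the work is linear
in the number of swap slots, whether or not the dying group owns any.

```c
/* Hypothetical model: one owner ID per swap slot, 0 meaning "unowned". */
#include <assert.h>
#include <stddef.h>

/*
 * Hand every slot owned by dead_id over to parent_id.
 * Returns the number of slots that changed owner.
 */
static size_t reassign_swap_owner(unsigned short *owner, size_t nslots,
                                  unsigned short dead_id,
                                  unsigned short parent_id)
{
    size_t i, moved = 0;

    assert(dead_id != 0);   /* 0 is reserved for unowned slots */

    /* Every slot is visited, regardless of how few match. */
    for (i = 0; i < nslots; i++) {
        if (owner[i] == dead_id) {
            owner[i] = parent_id;
            moved++;
        }
    }
    return moved;
}
```

After such a pass no swap entry refers to the dead ID any more, so the ID
could be recycled safely, at the price of a full scan on each removal.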

Hugh