Re: [PATCH -mm 0/8] memcg: reparent kmem on css offline

From: Vladimir Davydov
Date: Wed Jul 09 2014 - 03:26:21 EST

On Tue, Jul 08, 2014 at 06:05:19PM -0400, Johannes Weiner wrote:
> On Mon, Jul 07, 2014 at 07:40:08PM +0400, Vladimir Davydov wrote:
> > On Mon, Jul 07, 2014 at 10:25:06AM -0400, Johannes Weiner wrote:
> > > You could then reap dead slab caches as part of the regular per-memcg
> > > slab scanning in reclaim, without having to resort to auxiliary lists,
> > > vmpressure events etc.
> >
> > Do you mean adding a per memcg shrinker that will call kmem_cache_shrink
> > for all memcg caches on memcg/global pressure?
> >
> > Actually I recently made dead caches self-destructive at the cost of
> > slowing down kfrees to dead caches (see
> >, it's already in the mmotm tree) so
> > no dead cache reaping is necessary. Do you think we need it now?
> >
> > > I think it would save us a lot of code and complexity. You want
> > > per-memcg slab scanning *anyway*, all we'd have to change in the
> > > existing code would be to pin the css until the LRUs and kmem caches
> > > are truly empty, and switch mem_cgroup_iter() to css_tryget().
> > >
> > > Would this make sense to you?
> >
> > Hmm, interesting. Thank you for such a thorough explanation.
> >
> > One question. Do we still need to free mem_cgroup->kmemcg_id on css
> > offline so that it can be reused by new kmem-active cgroups (currently
> > we don't)?
> >
> > If we don't free it, the root_cache->memcg_params->memcg_arrays may
> > become really huge due to lots of dead css holding the id.
>
> We only need the O(1) access of the array for allocation - not frees
> and reclaim, right?


> So with your self-destruct code, can we prune caches of dead css and
> then just remove them from the array? Or move them from the array to
> a per-memcg linked list that can be scanned on memcg memory pressure?

This shouldn't be a problem. Will do that.
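
To make the idea concrete, here is a rough userspace sketch of moving a
cache from the id-indexed array to a per-memcg list on css offline, so
that the id slot can be reused. It is only a model under assumptions:
cache_model, memcg_model, id_array, lookup_for_alloc() and memcg_offline()
are hypothetical names, not kernel symbols, and a single root cache is
assumed, so each memcg owns at most one entry in the array.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KMEMCG_IDS 4	/* hypothetical limit, tiny on purpose */

struct cache_model {
	int objects;			/* objects still allocated from this cache */
	struct cache_model *next;	/* link in the owner's dead list */
};

struct memcg_model {
	int kmemcg_id;			/* index into id_array, -1 once released */
	struct cache_model *dead_caches; /* walked only on memory pressure */
};

/* Stand-in for the per-root-cache array indexed by kmemcg_id. */
static struct cache_model *id_array[MAX_KMEMCG_IDS];

/* Allocation path: the only place that needs O(1) lookup by id. */
static struct cache_model *lookup_for_alloc(const struct memcg_model *memcg)
{
	if (memcg->kmemcg_id < 0)
		return NULL;		/* offline: no new allocations */
	return id_array[memcg->kmemcg_id];
}

/* css offline: unhook the cache from the array and release the id. */
static void memcg_offline(struct memcg_model *memcg)
{
	int id = memcg->kmemcg_id;
	struct cache_model *c;

	assert(id >= 0 && id < MAX_KMEMCG_IDS);
	c = id_array[id];
	id_array[id] = NULL;		/* slot can be handed to a new cgroup */
	memcg->kmemcg_id = -1;

	if (c) {			/* keep the cache reachable for reclaim */
		c->next = memcg->dead_caches;
		memcg->dead_caches = c;
	}
}
```

With this layout the array stays bounded by the number of online
kmem-active cgroups, while frees and reclaim reach dead caches through
the list rather than through the id.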

Actually, I now doubt whether we need self-destruct at all. I don't really
like it, because its implementation is rather ugly and, what is worse,
it slows down kfree for dead caches noticeably. The SLAB maintainers don't
seem to be fond of it either. Maybe we'd better drop it in favour of
shrinking dead caches on memory pressure?

Then *empty* dead caches will be pending until memory pressure reaps
them, which looks a bit strange, because there's absolutely no reason to
keep them for so long. However, the code will be simpler then, and
kfrees to dead caches will proceed at the same speed as to active
caches.
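
The trade-off can be modeled with a small userspace sketch, assuming a
singly linked list of dead caches per memcg. The names dead_cache,
model_kfree() and reap_dead_caches() are hypothetical and do not exist
in the kernel. The free path does no extra work for dead caches, and an
empty dead cache stays on the list until the pressure path walks it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dead_cache {
	int objects;			/* objects still allocated from the cache */
	bool destroyed;			/* set once the cache has been reaped */
	struct dead_cache *next;
};

/* Free path: no "was this the last object?" check for dead caches. */
static void model_kfree(struct dead_cache *c)
{
	assert(c->objects > 0);
	c->objects--;
}

/*
 * Pressure path: walk the dead list, unlink and destroy every empty
 * cache, and return how many were reaped.
 */
static int reap_dead_caches(struct dead_cache **list)
{
	struct dead_cache **pp = list;
	int reaped = 0;

	while (*pp) {
		struct dead_cache *c = *pp;

		if (c->objects == 0) {
			*pp = c->next;	/* unlink from the list */
			c->next = NULL;
			c->destroyed = true;
			reaped++;
		} else {
			pp = &c->next;	/* still in use, keep it */
		}
	}
	return reaped;
}
```

In this model a cache whose last object has been freed remains pending
until reap_dead_caches() runs, which is exactly the cost being weighed
against the simpler free path.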
