Re: [PATCH] memcg: Remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure
From: Michal Hocko
Date: Fri Apr 13 2018 - 07:39:03 EST
On Fri 13-04-18 14:29:11, Kirill Tkhai wrote:
> On 13.04.2018 14:20, Michal Hocko wrote:
> > On Fri 13-04-18 14:06:40, Kirill Tkhai wrote:
> >> On 13.04.2018 14:02, Michal Hocko wrote:
> >>> On Fri 13-04-18 12:35:22, Kirill Tkhai wrote:
> >>>> On 13.04.2018 11:55, Michal Hocko wrote:
> >>>>> On Thu 12-04-18 17:52:04, Kirill Tkhai wrote:
> >>>>> [...]
> >>>>>> @@ -4471,6 +4477,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
> >>>>>>
> >>>>>> return &memcg->css;
> >>>>>> fail:
> >>>>>> + mem_cgroup_id_remove(memcg);
> >>>>>> mem_cgroup_free(memcg);
> >>>>>> return ERR_PTR(-ENOMEM);
> >>>>>> }
> >>>>>
> >>>>> The only path which jumps to fail: here (in the current mmotm tree) is
> >>>>> error = memcg_online_kmem(memcg);
> >>>>> if (error)
> >>>>> goto fail;
> >>>>>
> >>>>> AFAICS and the only failure path in memcg_online_kmem
> >>>>> memcg_id = memcg_alloc_cache_id();
> >>>>> if (memcg_id < 0)
> >>>>> return memcg_id;
> >>>>>
> >>>>> I am not entirely clear on memcg_alloc_cache_id but it seems we do clean
> >>>>> up properly. Or am I missing something?
> >>>>
> >>>> memcg_alloc_cache_id() may allocate a lot of memory, in case of the system reached
> >>>> memcg_nr_cache_ids cgroups. In this case it iterates over all LRU lists, and double
> >>>> size of every of them. In case of memory pressure it can fail. If this occurs,
> >>>> mem_cgroup::id is not unhashed from IDR and we leak this id.
> >>>
> >>> OK, my bad I was looking at the bad code path. So you want to clean up
> >>> after mem_cgroup_alloc not memcg_online_kmem. Now it makes much more
> >>> sense. Sorry for the confusion on my end.
> >>>
> >>> Anyway, shouldn't we do the thing in mem_cgroup_free() to be symmetric
> >>> to mem_cgroup_alloc?
> >>
> >> We can't, since it's called from mem_cgroup_css_free(), which doesn't have a deal
> >> with idr freeing. All the asymmetry, we see, is because of the trick to unhash ID
> >> earlier, then from mem_cgroup_css_free().
> >
> > Are you sure. It's been some time since I've looked at the quite complex
> > cgroup tear down code but from what I remember, css_free is called on
> > the css release (aka when the reference count drops to zero). mem_cgroup_id_put_many
> > seems to unpin the css reference so we should have idr_remove by the
> > time when css_free is called. Or am I still wrong and should go over the
> > brain hurting cgroup removal code again?
>
> mem_cgroup_id_put_many() unpins css, but this may be not the last reference to the css.
> Thus, we release ID earlier, then all references to css are freed.
Right and so what. If we have released the idr then we are not going to
do that again in css_free. That is why we have that memcg->id.id > 0
check before idr_remove and memcg->id.id = 0 for the last memcg ref.
count. So again, why cannot we do the clean up in mem_cgroup_free and
have a less confusing code? Or am I just not getting your point and
being dense here?
--
Michal Hocko
SUSE Labs