Re: [PATCH v2] mm: memcontrol: switch to rcu protection in drain_all_stock()
From: Roman Gushchin
Date: Mon Aug 05 2019 - 15:51:17 EST
On Mon, Aug 05, 2019 at 01:11:35PM +0200, Michal Hocko wrote:
> On Fri 02-08-19 12:22:41, Roman Gushchin wrote:
> > Commit 72f0184c8a00 ("mm, memcg: remove hotplug locking from try_charge")
> > introduced css_tryget()/css_put() calls in drain_all_stock(),
> > which are supposed to protect the target memory cgroup from being
> > released during the mem_cgroup_is_descendant() call.
> >
> > However, it's not completely safe. In theory, memcg can go away
> > between reading stock->cached pointer and calling css_tryget().
> >
> > This can happen if drain_all_stock() races with drain_local_stock()
> > running on a remote cpu from a work item scheduled by a previous
> > invocation of drain_all_stock().
>
> Maybe I am still missing something but I do not see how 72f0184c8a00
> changed the existing race. get_online_cpus doesn't prevent the same race
> right? If this is the case then it would be great to clarify that. I
> know that you are mostly after clarifying that css_tryget is
> insufficient but the above sounds like 72f0184c8a00 has introduced a
> regression.

Yeah, I'm not blaming 72f0184c8a00 for the race, which, as I said,
is barely reproducible at all. There is no "Fixes" tag, and I don't think
we need to backport it to stable.

Let's treat this patch as a refactoring, which makes the code cleaner.

>
> > The race is a bit theoretical and there is little chance of triggering
> > it, but the current code looks a bit confusing, so it makes sense
> > to fix it anyway. The code looks as if css_tryget() and
> > css_put() are used to protect stock draining. That's not necessary,
> > because stocked pages hold references to the cached cgroup.
> > And it obviously won't work for work items scheduled on other cpus.
> >
> > So, let's read the stock->cached pointer and evaluate the memory
> > cgroup inside a rcu read section, and get rid of
> > css_tryget()/css_put() calls.
> >
> > v2: added some explanations to the commit message, no code changes
> >
> > Signed-off-by: Roman Gushchin <guro@xxxxxx>
> > Cc: Michal Hocko <mhocko@xxxxxxxx>
> > Cc: Hillf Danton <hdanton@xxxxxxxx>
>
> Other than that.
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Thanks!
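
For reference, the check described in the commit message (read stock->cached and evaluate the memory cgroup inside an rcu read section, with no css_tryget()/css_put()) can be sketched in user space roughly as below. This is only an illustrative model, not the patch itself: the rcu_read_lock()/rcu_read_unlock() macros are no-op stand-ins, the structures are cut down to the fields used here, mem_cgroup_is_descendant() is a toy parent-pointer walk, and stock_needs_flush() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins (assumption): the real primitives live in the kernel. */
#define rcu_read_lock()   do { } while (0)
#define rcu_read_unlock() do { } while (0)

/* Simplified models of the kernel structures; only the fields used here. */
struct mem_cgroup {
	struct mem_cgroup *parent;	/* NULL at the top of the hierarchy */
};

struct memcg_stock_pcp {
	struct mem_cgroup *cached;	/* memcg the stocked pages belong to */
	unsigned int nr_pages;		/* number of stocked pages */
};

/* Toy hierarchy walk: true if memcg is root itself or lies below it. */
static bool mem_cgroup_is_descendant(struct mem_cgroup *memcg,
				     struct mem_cgroup *root)
{
	for (; memcg; memcg = memcg->parent)
		if (memcg == root)
			return true;
	return false;
}

/* Hypothetical helper: should this per-cpu stock be drained for root_memcg? */
static bool stock_needs_flush(struct memcg_stock_pcp *stock,
			      struct mem_cgroup *root_memcg)
{
	struct mem_cgroup *memcg;
	bool flush = false;

	assert(root_memcg != NULL);

	/* The pointer is read and evaluated entirely inside the read section. */
	rcu_read_lock();
	memcg = stock->cached;
	if (memcg && stock->nr_pages &&
	    mem_cgroup_is_descendant(memcg, root_memcg))
		flush = true;
	rcu_read_unlock();

	/* Only the boolean result leaves the section, not the memcg pointer. */
	return flush;
}
```

The point of the shape is that nothing dereferences the cached memcg after rcu_read_unlock(); the caller only sees the flush decision, so no extra css reference is needed for the check.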