Re: BUG: Bad page state in process - page dumped because: page still charged to cgroup

From: Roman Gushchin
Date: Thu Jul 02 2020 - 13:07:59 EST


On Thu, Jul 02, 2020 at 06:35:31PM +0200, Vlastimil Babka wrote:
> On 7/2/20 6:22 PM, Michal Hocko wrote:
> > On Wed 01-07-20 11:45:52, Roman Gushchin wrote:
> > [...]
> >> From c97afecd32c0db5e024be9ba72f43d22974f5bcd Mon Sep 17 00:00:00 2001
> >> From: Roman Gushchin <guro@xxxxxx>
> >> Date: Wed, 1 Jul 2020 11:05:32 -0700
> >> Subject: [PATCH] mm: kmem: make memcg_kmem_enabled() irreversible
> >>
> >> Historically, kernel memory accounting was an opt-in feature which
> >> could be enabled for individual cgroups. That is no longer true: it's
> >> on by default on both cgroup v1 and cgroup v2. And as long as a
> >> user has at least one non-root memory cgroup, kernel memory
> >> accounting is on. So in most setups it's either always on (if memory
> >> cgroups are in use and kmem accounting is not disabled) or always
> >> off (otherwise).
> >>
> >> memcg_kmem_enabled() is used in many places to guard the kernel memory
> >> accounting code. If memcg_kmem_enabled() can reverse from returning
> >> true to returning false (as now), we can't rely on it on release paths
> >> and have to check if it was on before.
> >>
> >> If we make memcg_kmem_enabled() irreversible (always returning true
> >> once it has returned true for the first time), the general logic
> >> becomes simpler and more robust. It also allows guarding some checks
> >> which would otherwise stay unguarded.
> >>
> >> Signed-off-by: Roman Gushchin <guro@xxxxxx>
>
> Fixes: ? or let Andrew squash it to some patch of your series (it's in mmotm I
> think)?

Hm, it's actually complicated. One obvious case was added by
"mm: memcg/slab: save obj_cgroup for non-root slab objects",
which is currently in the mm tree, so there is no stable hash to reference.

But I suspect there are more cases where we're just silently leaking
a memcg reference. Because the whole setup (going back and forth
between 0 and 1+ memory cgroups) is rarely seen in real life,
nobody has noticed. So I don't think we really need a stable backport.
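
To illustrate the leak described above, here is a minimal userspace sketch,
not kernel code. It assumes a toy model in which a plain counter stands in
for memcg_kmem_enabled_key; kmem_key, charge(), uncharge() and leaked_refs()
are hypothetical names that exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the static key (assumption: a plain counter). */
struct kmem_key {
	int count;	/* online non-root memcgs in this model */
	bool sticky;	/* true: once enabled, never reports disabled */
	bool ever_on;	/* set the first time a memcg comes online */
};

static bool kmem_enabled(const struct kmem_key *key)
{
	if (key->sticky)
		return key->ever_on;
	return key->count > 0;
}

static void memcg_online(struct kmem_key *key)
{
	key->count++;
	key->ever_on = true;
}

static void memcg_offline(struct kmem_key *key)
{
	assert(key->count > 0);
	key->count--;
}

/* Charge path: takes a reference only when accounting is enabled. */
static void charge(const struct kmem_key *key, int *refs)
{
	if (kmem_enabled(key))
		(*refs)++;
}

/* Release path guarded by the same check. */
static void uncharge(const struct kmem_key *key, int *refs)
{
	if (kmem_enabled(key) && *refs > 0)
		(*refs)--;
}

/* Returns the references left after online, charge, offline, uncharge. */
static int leaked_refs(bool sticky)
{
	struct kmem_key key = { .count = 0, .sticky = sticky, .ever_on = false };
	int refs = 0;

	memcg_online(&key);	/* 0 -> 1 memory cgroups */
	charge(&key, &refs);	/* object charged while enabled */
	memcg_offline(&key);	/* 1 -> 0 memory cgroups */
	uncharge(&key, &refs);	/* skipped if the check flipped back */
	return refs;
}
```

In this model leaked_refs(false) leaves one reference behind because the
release path is skipped, while leaked_refs(true) leaves none.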

So IMO the best option is to put it in as a standalone patch _before_
my series. Does that sound good to you?

>
> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

Thanks!

>
> But see below:
>
> >> ---
> >> mm/memcontrol.c | 6 ++----
> >> 1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> >> index 50ae77f3985e..2d018a51c941 100644
> >> --- a/mm/memcontrol.c
> >> +++ b/mm/memcontrol.c
> >> @@ -3582,7 +3582,8 @@ static int memcg_online_kmem(struct mem_cgroup *memcg)
> >>  	objcg->memcg = memcg;
> >>  	rcu_assign_pointer(memcg->objcg, objcg);
> >>
> >> -	static_branch_inc(&memcg_kmem_enabled_key);
> >> +	if (!memcg_kmem_enabled())
> >> +		static_branch_inc(&memcg_kmem_enabled_key);
> >
> > Wouldn't static_branch_enable() be more readable?
>
> Yes, and drop the if(). It will just do nothing and return if already enabled.
> Maybe slightly less efficient, but this is not a fast path anyway, and it feels
> weird to modify the static key in a branch controlled by the static key itself
> (CC peterz in case he wants to add something).

Ok, will do in v2.
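
For reference, the difference between the two shapes can be sketched in
userspace. This is only a toy model, not the kernel's static key
implementation: toy_key and the toy_branch_*() helpers are hypothetical
names, and the model assumes that an enable on an already enabled key is a
no-op, as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a static key; not the kernel's struct static_key. */
struct toy_key {
	int enabled;
};

/* Counting style: every call bumps the count, so calls must be balanced. */
static void toy_branch_inc(struct toy_key *key)
{
	key->enabled++;
}

/* Boolean style: turns the key on, does nothing if it is already on. */
static void toy_branch_enable(struct toy_key *key)
{
	if (key->enabled > 0) {
		assert(key->enabled == 1); /* model assumes no mixing with inc */
		return;
	}
	key->enabled = 1;
}

static bool toy_branch_on(const struct toy_key *key)
{
	return key->enabled > 0;
}

/* v1 shape from the quoted hunk: inc guarded by the key's own state. */
static int online_v1(struct toy_key *key, int ncgroups)
{
	for (int i = 0; i < ncgroups; i++)
		if (!toy_branch_on(key))
			toy_branch_inc(key);
	return key->enabled;
}

/* Suggested shape: unconditional, idempotent enable. */
static int online_v2(struct toy_key *key, int ncgroups)
{
	for (int i = 0; i < ncgroups; i++)
		toy_branch_enable(key);
	return key->enabled;
}
```

Both variants leave the toy key at 1 no matter how many cgroups come online;
the second just does it without reading the key in the caller.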

Thanks!