Re: Possible mem cgroup bug in kernels between 4.18.0 and 5.3-rc1.

From: Michal Hocko
Date: Mon Aug 05 2019 - 10:26:27 EST

Next message: Andrew Lunn: "Re: [PATCH 16/16] dt-bindings: net: add bindings for ADIN PHY driver"
Previous message: Mimi Zohar: "Re: [PATCH v12 01/11] MODSIGN: Export module signature definitions"
In reply to: Tetsuo Handa: "Re: Possible mem cgroup bug in kernels between 4.18.0 and 5.3-rc1."
Next in thread: Tetsuo Handa: "Re: Possible mem cgroup bug in kernels between 4.18.0 and 5.3-rc1."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon 05-08-19 23:00:12, Tetsuo Handa wrote:
> On 2019/08/05 20:44, Michal Hocko wrote:
> >> Allowing forced charge due to being unable to invoke memcg OOM killer
> >> will lead to global OOM situation, and just returning -ENOMEM will not
> >> solve memcg OOM situation.
> >
> > Returning -ENOMEM would effectivelly lead to triggering the oom killer
> > from the page fault bail out path. So effectively get us back to before
> > 29ef680ae7c21110. But it is true that this is riskier from the
> > observability POV when a) the OOM path wouldn't point to the culprit and
> > b) it would leak ENOMEM from g-u-p path.
> >
>
> Excuse me? But according to my experiment, below code showed flood of
> "Returning -ENOMEM" message instead of invoking the OOM killer.
> I didn't find it gets us back to before 29ef680ae7c21110...

You would need to declare OOM_ASYNC to return ENOMEM properly from the
charge (which is effectivelly a revert of 29ef680ae7c21110 for NOFS
allocations). Something like the following

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ba9138a4a1de..cc34ff0932ce 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1797,7 +1797,7 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
* Please note that mem_cgroup_out_of_memory might fail to find a
* victim and then we have to bail out from the charge path.
*/
- if (memcg->oom_kill_disable) {
+ if (memcg->oom_kill_disable || !(mask & __GFP_FS)) {
if (!current->in_user_fault)
return OOM_SKIPPED;
css_get(&memcg->css);

I am quite surprised that your patch didn't trigger the global OOM
though. It might mean that ENOMEM doesn't propagate all the way down to
the #PF handler for this path for some reason.

Anyway what I meant to say is that returning ENOMEM has the
observable issues as well.
--
Michal Hocko
SUSE Labs

Next message: Andrew Lunn: "Re: [PATCH 16/16] dt-bindings: net: add bindings for ADIN PHY driver"
Previous message: Mimi Zohar: "Re: [PATCH v12 01/11] MODSIGN: Export module signature definitions"
In reply to: Tetsuo Handa: "Re: Possible mem cgroup bug in kernels between 4.18.0 and 5.3-rc1."
Next in thread: Tetsuo Handa: "Re: Possible mem cgroup bug in kernels between 4.18.0 and 5.3-rc1."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]