Re: Possible mem cgroup bug in kernels between 4.18.0 and 5.3-rc1.
From: Michal Hocko
Date: Mon Aug 05 2019 - 10:26:27 EST
On Mon 05-08-19 23:00:12, Tetsuo Handa wrote:
> On 2019/08/05 20:44, Michal Hocko wrote:
> >> Allowing forced charge due to being unable to invoke memcg OOM killer
> >> will lead to global OOM situation, and just returning -ENOMEM will not
> >> solve memcg OOM situation.
> >
> > Returning -ENOMEM would effectivelly lead to triggering the oom killer
> > from the page fault bail out path. So effectively get us back to before
> > 29ef680ae7c21110. But it is true that this is riskier from the
> > observability POV when a) the OOM path wouldn't point to the culprit and
> > b) it would leak ENOMEM from g-u-p path.
> >
>
> Excuse me? But according to my experiment, below code showed flood of
> "Returning -ENOMEM" message instead of invoking the OOM killer.
> I didn't find it gets us back to before 29ef680ae7c21110...
You would need to declare OOM_ASYNC to return ENOMEM properly from the
charge (which is effectivelly a revert of 29ef680ae7c21110 for NOFS
allocations). Something like the following
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ba9138a4a1de..cc34ff0932ce 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1797,7 +1797,7 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
* Please note that mem_cgroup_out_of_memory might fail to find a
* victim and then we have to bail out from the charge path.
*/
- if (memcg->oom_kill_disable) {
+ if (memcg->oom_kill_disable || !(mask & __GFP_FS)) {
if (!current->in_user_fault)
return OOM_SKIPPED;
css_get(&memcg->css);
I am quite surprised that your patch didn't trigger the global OOM
though. It might mean that ENOMEM doesn't propagate all the way down to
the #PF handler for this path for some reason.
Anyway what I meant to say is that returning ENOMEM has the
observable issues as well.
--
Michal Hocko
SUSE Labs