Re: [PATCH] memcg, oom: be careful about races when warning about no reclaimable task

From: Johannes Weiner
Date: Tue Aug 07 2018 - 16:16:39 EST


On Tue, Aug 07, 2018 at 07:15:11PM +0900, Tetsuo Handa wrote:
> On 2018/08/07 16:25, Michal Hocko wrote:
> > @@ -1703,7 +1703,8 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
> > return OOM_ASYNC;
> > }
> >
> > - if (mem_cgroup_out_of_memory(memcg, mask, order))
> > + if (mem_cgroup_out_of_memory(memcg, mask, order) ||
> > + tsk_is_oom_victim(current))
> > return OOM_SUCCESS;
> >
> > WARN(1,"Memory cgroup charge failed because of no reclaimable memory! "
> >
>
> I don't think this patch is appropriate. This patch only avoids hitting WARN(1).
> This patch does not address the root cause:
>
> The task_will_free_mem(current) test in out_of_memory() is returning false
> because test_bit(MMF_OOM_SKIP, &mm->flags) test in task_will_free_mem() is
> returning false because MMF_OOM_SKIP was already set by the OOM reaper. The OOM
> killer does not need to start selecting next OOM victim until "current thread
> completes __mmput()" or "it fails to complete __mmput() within reasonable
> period".

I don't see why it matters whether the OOM victim exits or not, unless
you count the memory consumed by struct task_struct.

> According to https://syzkaller.appspot.com/text?tag=CrashLog&x=15a1c770400000 ,
> PID=23767 selected PID=23766 as an OOM victim and the OOM reaper set MMF_OOM_SKIP
> before PID=23766 unnecessarily selects PID=23767 as next OOM victim.
> At uptime = 366.550949, out_of_memory() should have returned true without selecting
> next OOM victim because tsk_is_oom_victim(current) == true.

The code works just fine. We have to kill tasks until we a) free
enough memory or b) run out of tasks or c) kill current. When one of
these outcomes is reached, we allow the charge and return.

The only problem here is a warning in the wrong place.