Re: [PATCH 0/2] oom, memcg: do not report racy no-eligible OOM

From: Tetsuo Handa
Date: Fri Jan 11 2019 - 10:02:29 EST


On 2019/01/11 22:34, Michal Hocko wrote:
> On Fri 11-01-19 21:40:52, Tetsuo Handa wrote:
> [...]
>> Did you notice that there is no
>>
>> "Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n"
>>
>> line between
>>
>> [ 71.304703][ T9694] Memory cgroup out of memory: Kill process 9692 (a.out) score 904 or sacrifice child
>>
>> and
>>
>> [ 71.309149][ T54] oom_reaper: reaped process 9750 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:185532kB
>>
>> ? Then, you will find that [ T9694] failed to reach for_each_process(p) loop inside
>> __oom_kill_process() in the first round of out_of_memory() call because
>> find_lock_task_mm() == NULL at __oom_kill_process() because Ctrl-C made that victim
>> complete exit_mm() before find_lock_task_mm() is called.
>
> OK, so we haven't killed anything because the victim has exited by the
> time we wanted to do so. We still have other tasks sharing that mm
> pending and not killed because nothing has killed them yet, right?

The OOM killer invoked by [ T9694] called printk() but didn't kill anything.
Instead, SIGINT from Ctrl-C killed all thread groups sharing current->mm.
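
To make the first round concrete, here is a toy userspace model of the ordering I am describing. It is only a sketch, not the code in mm/oom_kill.c: struct toy_mm, struct toy_task, toy_find_task_mm() and toy_oom_kill_process() are hypothetical names invented for illustration, and the messages are shortened. It shows that once the victim has lost its mm, the "Kill process" message is already out but neither the "Killed process" message nor the loop over tasks sharing the mm is reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for illustration only; not the kernel's real types. */
struct toy_mm { int id; };

struct toy_task {
	int pid;
	const char *comm;
	struct toy_mm *mm;	/* NULL once the task has completed exit_mm() */
	bool sigkilled;		/* set when this model "kills" the task */
};

/* Models find_lock_task_mm(): succeeds only while the task still owns an mm. */
static struct toy_task *toy_find_task_mm(struct toy_task *t)
{
	return t->mm ? t : NULL;
}

/*
 * Models the order of events described above: the "Kill process" message
 * comes first, then the mm lookup, then the "Killed process" message and
 * the loop over all tasks sharing the mm. Returns the number of tasks killed.
 */
static int toy_oom_kill_process(struct toy_task *victim,
				struct toy_task *tasks, size_t ntasks)
{
	struct toy_task *p;
	struct toy_mm *mm;
	int killed = 0;
	size_t i;

	assert(victim != NULL && tasks != NULL);
	printf("Kill process %d (%s)\n", victim->pid, victim->comm);

	p = toy_find_task_mm(victim);
	if (!p)
		return 0;	/* victim already exited: no "Killed process" line */

	mm = p->mm;
	printf("Killed process %d (%s)\n", p->pid, p->comm);
	for (i = 0; i < ntasks; i++) {
		if (tasks[i].mm == mm && !tasks[i].sigkilled) {
			tasks[i].sigkilled = true;
			killed++;
		}
	}
	return killed;
}
```

Calling toy_oom_kill_process() on a victim whose mm field is NULL returns 0 and leaves every other task sharing that mm untouched, which matches the missing "Killed process" line in the log.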

>
> How come the oom reaper could act on this oom event at all then?
>
> What am I missing?
>

The OOM killer invoked by [ T9750] did not call printk() but hit
task_will_free_mem(current) in out_of_memory() and invoked the OOM reaper,
without calling mark_oom_victim() on all thread groups sharing current->mm.
Did you notice that I wrote that

  Since mm-oom-marks-all-killed-tasks-as-oom-victims.patch does not call mark_oom_victim()
  when task_will_free_mem() == true,

? :-(
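
The second round can be sketched the same way. Again this is a toy userspace model under my own assumptions, not the kernel code: struct toy_task, struct toy_oom_outcome, toy_out_of_memory() and toy_unmarked_sharers() are hypothetical names. I am assuming that the shortcut marks only the calling task and hands only that task to the reaper. The point is that every other thread group sharing the same mm stays unmarked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for illustration only; not the kernel's task_struct. */
struct toy_task {
	int pid;
	int mm_id;		/* tasks with the same mm_id share one address space */
	bool will_free_mem;	/* models task_will_free_mem() returning true */
	bool oom_victim;	/* models the effect of mark_oom_victim() */
};

struct toy_oom_outcome {
	bool took_shortcut;	/* true if the kill path was skipped */
	int reaper_pid;		/* pid handed to the reaper, or -1 if none */
};

/*
 * Models the shortcut described above: when the calling task is already
 * exiting, it alone is marked and queued for reaping (an assumption of
 * this sketch), and no message is printed.
 */
static struct toy_oom_outcome toy_out_of_memory(struct toy_task *cur)
{
	struct toy_oom_outcome out = { false, -1 };

	assert(cur != NULL);
	if (cur->will_free_mem) {
		cur->oom_victim = true;
		out.took_shortcut = true;
		out.reaper_pid = cur->pid;
	}
	return out;
}

/* Counts tasks that share mm_id but were never marked as OOM victims. */
static size_t toy_unmarked_sharers(const struct toy_task *tasks,
				   size_t ntasks, int mm_id)
{
	size_t i, n = 0;

	for (i = 0; i < ntasks; i++)
		if (tasks[i].mm_id == mm_id && !tasks[i].oom_victim)
			n++;
	return n;
}
```

With two exiting tasks that share one mm, calling toy_out_of_memory() on the first takes the shortcut and queues only that task, and toy_unmarked_sharers() still reports the second one as unmarked.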