Re: [patch v4] mm, oom: fix unnecessary killing of additional processes

From: Tetsuo Handa
Date: Tue Jul 24 2018 - 18:32:10 EST

On 2018/07/25 6:45, David Rientjes wrote:
> On Sat, 21 Jul 2018, Tetsuo Handa wrote:
>> You can't apply "[patch v4] mm, oom: fix unnecessary killing of additional processes"
>> because Michal's patch which removes oom_lock serialization was added to -mm tree.
> I've rebased the patch to linux-next and posted a v5.
>> You might worry about situations where __oom_reap_task_mm() is a no-op.
>> But that is not always true. There is no point with emitting
>> pr_info("oom_reaper: unable to reap pid:%d (%s)\n", ...);
>> debug_show_all_locks();
>> noise and doing
>> set_bit(MMF_OOM_SKIP, &mm->flags);
>> because exit_mmap() will not release oom_lock until __oom_reap_task_mm()
>> completes. That is, except extra noise, there is no difference with
>> current behavior which sets set_bit(MMF_OOM_SKIP, &mm->flags) after
>> returning from __oom_reap_task_mm().
> v5 has restructured how exit_mmap() serializes its unmapping with the oom
> reaper. It sets MMF_OOM_SKIP while holding mm->mmap_sem.

I think that v5 is still wrong. The fact that exit_mmap() holds mmap_sem for
write does not prevent oom_reap_task() from emitting the noise and setting
MMF_OOM_SKIP after the timeout. Since your purpose is to wait for the release of
memory which could not be reclaimed by __oom_reap_task_mm(), what if
__oom_reap_task_mm() was a no-op and exit_mmap() was preempted immediately after
returning from __oom_reap_task_mm()?
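
To make the scenario concrete, here is a rough single-threaded userspace model
of it. This is only a sketch, not kernel code: struct mm_model,
model_oom_reap_task() and model_preempted_exit() are hypothetical names, the
booleans merely stand in for mmap_sem and MMF_OOM_SKIP, and the retry count of
10 is an assumption about how many times the reaper retries before giving up.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed retry budget of the reaper before it gives up (illustrative). */
#define MODEL_REAP_RETRIES 10

/* Hypothetical stand-in for the few bits of mm state that matter here. */
struct mm_model {
	bool mmap_sem_held_for_write;	/* exit_mmap() holds mmap_sem for write */
	bool oom_skip;			/* stands in for MMF_OOM_SKIP */
	bool memory_freed;		/* exit_mmap() finished unmapping */
};

struct race_result {
	bool reaper_gave_up;
	bool oom_skip_set;
	bool memory_freed;
};

/*
 * Models the retry loop: each attempt fails while the writer holds the
 * lock. Returns true if the retries ran out and the skip flag was set.
 */
bool model_oom_reap_task(struct mm_model *mm)
{
	for (int attempt = 0; attempt < MODEL_REAP_RETRIES; attempt++) {
		if (!mm->mmap_sem_held_for_write)
			return false;	/* lock available, reaping could proceed */
		/* The real reaper sleeps here; the preempted exit_mmap()
		 * makes no progress in the meantime. */
	}
	mm->oom_skip = true;	/* timeout: the victim is marked as skippable */
	return true;
}

/*
 * Scenario: __oom_reap_task_mm() was a no-op and exit_mmap() was preempted
 * right after it, still holding the lock and having freed nothing.
 */
struct race_result model_preempted_exit(void)
{
	struct mm_model mm = {
		.mmap_sem_held_for_write = true,
		.oom_skip = false,
		.memory_freed = false,
	};
	struct race_result r;

	assert(mm.mmap_sem_held_for_write);	/* precondition of the scenario */
	r.reaper_gave_up = model_oom_reap_task(&mm);
	r.oom_skip_set = mm.oom_skip;
	r.memory_freed = mm.memory_freed;
	return r;
}
```

In this model the skip flag ends up set while nothing has been freed yet, which
is the window in which the OOM killer could select the next victim.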

Also, I believe that a userspace-visible knob is not needed.