Re: [patch v2] mm, oom: fix concurrent munlock and oom reaper unmap

From: Tetsuo Handa
Date: Thu Apr 19 2018 - 06:46:09 EST


Michal Hocko wrote:
> > exit_mmap() does not block before set_bit(MMF_OOM_SKIP) once it is
> > entered.
>
> Not true. munlock_vma_pages_all might take page_lock which can have
> unpredictable dependences. This is the reason why we are ruling out
> mlocked VMAs in the first place when reaping the address space.

Wow! Then,

> While you are correct, strictly speaking, because unmap_vmas can race
> with the oom reaper. With the lock held during the whole operation we
> can indeed trigger back off in the oom_repaer. It will keep retrying but
> the tear down can take quite some time. This is a fair argument. On the
> other hand your lock protocol introduces the MMF_OOM_SKIP problem I've
> mentioned above and that really worries me. The primary objective of the
> reaper is to guarantee a forward progress without relying on any
> externalities. We might kill another OOM victim but that is safer than
> lock up.

current code has a possibility that the OOM reaper is disturbed by
unpredictable dependencies, like I worried that

I think that there is a possibility that the OOM reaper tries to reclaim
mlocked pages as soon as exit_mmap() cleared VM_LOCKED flag by calling
munlock_vma_pages_all().

when current approach was proposed. We currently have the MMF_OOM_SKIP problem.
We need to teach the OOM reaper stop reaping as soon as entering exit_mmap().
Maybe let the OOM reaper poll for progress (e.g. none of get_mm_counter(mm, *)
decreased for last 1 second) ?