Re: [RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the #PF

From: Tetsuo Handa
Date: Fri May 19 2017 - 09:02:52 EST

Next message: Christoph Hellwig: "Re: [PATCH] ib/core: not to set page dirty bit if it's already set."
Previous message: Petr Mladek: "Re: [PATCH v2] printk: Use the main logbuf in NMI when logbuf_lock is available"
In reply to: Michal Hocko: "[RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the #PF"
Next in thread: Michal Hocko: "Re: [RFC PATCH 2/2] mm, oom: do not trigger out_of_memory from the #PF"

Michal Hocko wrote:
> Any allocation failure during the #PF path will return with VM_FAULT_OOM
> which in turn results in pagefault_out_of_memory. This can happen for
> 2 different reasons. a) Memcg is out of memory and we rely on
> mem_cgroup_oom_synchronize to perform the memcg OOM handling or b)
> normal allocation fails.
>
> The latter is quite problematic because allocation paths already trigger
> out_of_memory and the page allocator tries really hard to not fail

We changed many memory allocation requests in the page fault path (e.g.
in XFS) to use __GFP_FS some time ago, didn't we? But if I recall
correctly (I couldn't find the message), there are still some allocation
requests in the page fault path which cannot use __GFP_FS. In that case,
not all allocation requests can call oom_kill_process(), and reaching
pagefault_out_of_memory() will be inevitable.
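
To illustrate the point, here is a toy user-space model, not kernel code.
The TOY_* names, the flag values, and toy_fault_alloc() are made up for
this sketch, and the rule "only __GFP_FS-capable requests can reach the
OOM killer from the allocator" is taken as an assumption from the
paragraph above:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for gfp bits; the values are NOT the kernel's. */
#define TOY_GFP_IO 0x1u
#define TOY_GFP_FS 0x2u

enum toy_result {
	TOY_ALLOC_OK,           /* allocation succeeded */
	TOY_OOM_KILLER_INVOKED, /* allocator path reached the OOM killer */
	TOY_FAULT_OOM,          /* allocation failed; the fault handler would
				 * return VM_FAULT_OOM and end up in
				 * pagefault_out_of_memory() */
};

/* Toy model of one allocation request made from the page fault path. */
static enum toy_result toy_fault_alloc(unsigned int gfp, bool memory_available)
{
	/* Only the two toy bits are understood by this model. */
	assert((gfp & ~(TOY_GFP_IO | TOY_GFP_FS)) == 0);

	if (memory_available)
		return TOY_ALLOC_OK;

	/* Assumption: a request that may use __GFP_FS can call the OOM killer. */
	if (gfp & TOY_GFP_FS)
		return TOY_OOM_KILLER_INVOKED;

	/* A request without __GFP_FS simply fails in this model. */
	return TOY_FAULT_OOM;
}
```

In this model, toy_fault_alloc(TOY_GFP_IO, false) yields TOY_FAULT_OOM,
which is the case where pagefault_out_of_memory() is still reached.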

> allocations. Anyway, if the OOM killer has been already invoked there
> is no reason to invoke it again from the #PF path. Especially when the
> OOM condition might be gone by that time and we have no way to find out
> other than allocate.
>
> Moreover if the allocation failed and the OOM killer hasn't been
> invoked then we are unlikely to do the right thing from the #PF context
> because we have already lost the allocation context and restrictions and
> therefore might oom kill a task from a different NUMA domain.

If we carry a flag via task_struct that indicates that this is a memory
allocation request from the page fault path and that allocation failure is
not acceptable, we can call out_of_memory() from the page allocator path,
where the allocation context is still known.
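
A rough user-space sketch of that idea follows. Everything in it is
hypothetical: struct toy_task, the fault_alloc_must_not_fail field, and
toy_slowpath_decision() are invented for illustration and do not exist
in the kernel. It only shows how such a per-task flag could steer the
decision inside the allocator instead of deferring it to the #PF path:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of task_struct. */
struct toy_task {
	/* Set by the fault handler: this allocation must not fail. */
	bool fault_alloc_must_not_fail;
};

enum toy_action {
	TOY_RETRY,              /* reclaim made progress, try again */
	TOY_CALL_OUT_OF_MEMORY, /* invoke the OOM killer from the allocator */
	TOY_FAIL_ALLOCATION,    /* return failure to the caller */
};

/*
 * Toy model of the allocator slow path decision.
 * can_invoke_oom models "this request is normally allowed to call
 * the OOM killer" (e.g. it may use __GFP_FS).
 */
static enum toy_action toy_slowpath_decision(const struct toy_task *tsk,
					     bool can_invoke_oom,
					     bool reclaim_progress)
{
	assert(tsk != NULL);

	if (reclaim_progress)
		return TOY_RETRY;

	/* The per-task flag lets a fault-path request reach the OOM killer
	 * while the allocation context is still available. */
	if (can_invoke_oom || tsk->fault_alloc_must_not_fail)
		return TOY_CALL_OUT_OF_MEMORY;

	return TOY_FAIL_ALLOCATION;
}
```

With the flag set, a request that otherwise could not invoke the OOM
killer gets TOY_CALL_OUT_OF_MEMORY instead of TOY_FAIL_ALLOCATION.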

> - if (!mutex_trylock(&oom_lock))
> + if (fatal_signal_pending)

This should be fatal_signal_pending(current).

By the way, can a page fault occur after reaching do_exit()? Once a thread
has reached do_exit(), fatal_signal_pending(current) becomes false, doesn't it?
