Re: [PATCH 2/3] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically

From: Tetsuo Handa
Date: Tue Dec 20 2016 - 10:32:00 EST


Michal Hocko wrote:
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c8eed66d8abb..2dda7c3eba52 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3098,32 +3098,31 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> if (page)
> goto out;
>
> - if (!(gfp_mask & __GFP_NOFAIL)) {
> - /* Coredumps can quickly deplete all memory reserves */
> - if (current->flags & PF_DUMPCORE)
> - goto out;
> - /* The OOM killer will not help higher order allocs */
> - if (order > PAGE_ALLOC_COSTLY_ORDER)
> - goto out;
> - /* The OOM killer does not needlessly kill tasks for lowmem */
> - if (ac->high_zoneidx < ZONE_NORMAL)
> - goto out;
> - if (pm_suspended_storage())
> - goto out;
> - /*
> - * XXX: GFP_NOFS allocations should rather fail than rely on
> - * other request to make a forward progress.
> - * We are in an unfortunate situation where out_of_memory cannot
> - * do much for this context but let's try it to at least get
> - * access to memory reserved if the current task is killed (see
> - * out_of_memory). Once filesystems are ready to handle allocation
> - * failures more gracefully we should just bail out here.
> - */
> + /* Coredumps can quickly deplete all memory reserves */
> + if (current->flags & PF_DUMPCORE)
> + goto out;
> + /* The OOM killer will not help higher order allocs */
> + if (order > PAGE_ALLOC_COSTLY_ORDER)
> + goto out;
> + /* The OOM killer does not needlessly kill tasks for lowmem */
> + if (ac->high_zoneidx < ZONE_NORMAL)
> + goto out;
> + if (pm_suspended_storage())
> + goto out;
> + /*
> + * XXX: GFP_NOFS allocations should rather fail than rely on
> + * other request to make a forward progress.
> + * We are in an unfortunate situation where out_of_memory cannot
> + * do much for this context but let's try it to at least get
> + * access to memory reserved if the current task is killed (see
> + * out_of_memory). Once filesystems are ready to handle allocation
> + * failures more gracefully we should just bail out here.
> + */
> +
> + /* The OOM killer may not free memory on a specific node */
> + if (gfp_mask & __GFP_THISNODE)
> + goto out;
>
> - /* The OOM killer may not free memory on a specific node */
> - if (gfp_mask & __GFP_THISNODE)
> - goto out;
> - }
> /* Exhausted what can be done so it's blamo time */
> if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
> *did_some_progress = 1;

Why do we need to change this part in this patch?

This change silently prohibits invoking the OOM killer for e.g. costly
GFP_KERNEL allocation. While it would be better if vmalloc() can be used,
there might be users who cannot accept vmalloc() as a fallback (e.g.
CONFIG_MMU=n where vmalloc() == kmalloc() ?).

This change is not "do not enforce OOM killer automatically" but "never allow
OOM killer". No exception is allowed. If we change this part, title for this part
should be something strong like "mm,oom: Never allow OOM killer for coredumps,
costly allocations, lowmem etc.".