Re: [PATCH] fdtable: Avoid triggering OOMs from alloc_fdmem

From: David Rientjes
Date: Tue Feb 04 2014 - 16:27:49 EST


On Tue, 4 Feb 2014, Eric W. Biederman wrote:

> My gut feel says if there is a code path that has __GFP_NOWARN and
> because of PAGE_ALLOC_COSTLY_ORDER we loop forever then there is
> something fishy going on.
>

The __GFP_NOWARN without __GFP_NORETRY in alloc_fdmem() is pointless
because we already know that the allocation is PAGE_ALLOC_COSTLY_ORDER or
smaller. That function encodes specific knowledge of the page allocator's
implementation so it leads me to believe that __GFP_NOWARN was intended to
be __GFP_NORETRY from the start. Otherwise, it's just set pointlessly and
specifically allows for the oom killing that you're now reporting. Since
it can fall back to vmalloc() after exhausting all of the page allocator's
capabilities, the __GFP_NOWARN|__GFP_NORETRY seems entirely appropriate.

The vmalloc() fallback has never been reached after a failed kmalloc() in
this function, because kmalloc() retries indefinitely in this allocation
context, but it definitely seems better than oom killing something.
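
To make the fallback argument concrete, here is a minimal userspace sketch
of the pattern being discussed. It is a model, not kernel code: every
model_* name, the MODEL_* flag values, and the "no contiguous memory"
switch are stand-ins invented for illustration, and plain malloc() plays
the part of both allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel's real values. */
#define MODEL_PAGE_SIZE		4096u
#define MODEL_COSTLY_ORDER	3u
#define MODEL_GFP_NOWARN	0x1u
#define MODEL_GFP_NORETRY	0x2u

enum model_source { FROM_NONE, FROM_KMALLOC, FROM_VMALLOC };

/* Test knobs: is physically contiguous memory available, and how many
 * times did the modelled oom killer have to run? */
static int model_contiguous_available = 1;
static int model_oom_kills;

/* Models a small (non-costly) kmalloc(). Without NORETRY the allocator
 * keeps going until the oom killer frees something; with NORETRY it
 * gives up and lets the caller decide what to do. */
static void *model_kmalloc(size_t size, unsigned int flags)
{
	if (model_contiguous_available)
		return malloc(size);
	if (flags & MODEL_GFP_NORETRY)
		return NULL;
	model_oom_kills++;
	return malloc(size);
}

/* Models vmalloc(): does not need physically contiguous pages. */
static void *model_vmalloc(size_t size)
{
	return malloc(size);
}

/* The shape of alloc_fdmem() under discussion: try the slab allocator
 * for small sizes, otherwise (or on failure) use vmalloc(). */
static void *model_alloc_fdmem(size_t size, unsigned int flags,
			       enum model_source *src)
{
	assert(src != NULL);

	if (size <= ((size_t)MODEL_PAGE_SIZE << MODEL_COSTLY_ORDER)) {
		void *data = model_kmalloc(size, flags);

		if (data != NULL) {
			*src = FROM_KMALLOC;
			return data;
		}
	}
	*src = FROM_VMALLOC;
	return model_vmalloc(size);
}
```

With model_contiguous_available set to 0, passing
MODEL_GFP_NOWARN|MODEL_GFP_NORETRY takes the vmalloc path with no oom kill,
while passing MODEL_GFP_NOWARN alone is satisfied by kmalloc only after
model_oom_kills has been incremented.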

Acked-by: David Rientjes <rientjes@xxxxxxxxxx>

> I would love to hear some people who are more current on the mm
> subsystem than I am chime in. It might be that the darn fix is going to
> be to teach __alloc_pages_slowpath to not loop forever, unless order == 0.

It doesn't loop forever; it will either return NULL because of its
allocation context or the oom killer will kill something, even for order-3
allocations. In the case that you've modified, you have sane fallback
behavior that can be utilized rather than the oom killer and __GFP_NORETRY
was reasonable from the start.

The question is simple enough: do we want to change
PAGE_ALLOC_COSTLY_ORDER to be smaller so that order-3 does return NULL
without oom killing? Perhaps there's an argument to be made for doing
exactly that, but by not setting __GFP_NORETRY you are really demanding
order-3 memory at the time you allocate it and are willing to accept the
consequences of freeing that memory. Should we make everything except for
order-0 inherently __GFP_NORETRY and introduce a replacement __GFP_RETRY?
That's doable as well, but it would be a massive effort.
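
The trade-off in that question can be sketched as a small decision
function. This is a deliberately simplified userspace model of what happens
once reclaim has failed to satisfy a request; it ignores allocation
context, __GFP_NOFAIL, and every other flag, and the sketch_* names, flag
value, and order bound are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
#define SKETCH_NORETRY		0x1u
#define SKETCH_MAX_ORDER	11u	/* illustrative upper bound on order */

enum sketch_outcome {
	SKETCH_RETURN_NULL,		/* caller must handle failure */
	SKETCH_OOM_KILL_AND_RETRY,	/* something gets killed to make room */
};

/* What the modelled allocator does when reclaim could not satisfy a
 * request of the given order. costly_order is a parameter so the effect
 * of lowering the threshold can be compared directly. */
static enum sketch_outcome sketch_after_reclaim_fails(unsigned int order,
						       unsigned int flags,
						       unsigned int costly_order)
{
	assert(order < SKETCH_MAX_ORDER);

	if (flags & SKETCH_NORETRY)
		return SKETCH_RETURN_NULL;
	if (order <= costly_order)
		return SKETCH_OOM_KILL_AND_RETRY;
	return SKETCH_RETURN_NULL;
}
```

In this model, an order-3 request without SKETCH_NORETRY ends in an oom
kill when costly_order is 3, but returns NULL if costly_order is lowered to
2; setting SKETCH_NORETRY gives the NULL return at any order.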