Re: [PATCH] mm, vmscan: do not loop on too_many_isolated for ever

From: Johannes Weiner
Date: Thu Mar 09 2017 - 13:11:44 EST


On Tue, Mar 07, 2017 at 02:52:36PM -0500, Rik van Riel wrote:
> It only does this to some extent.  If reclaim made
> no progress, for example due to immediately bailing
> out because the number of already isolated pages is
> too high (due to many parallel reclaimers), the code
> could hit the "no_progress_loops > MAX_RECLAIM_RETRIES"
> test without ever looking at the number of reclaimable
> pages.

Hm, there is no early return there, actually. We bump the loop counter
every time it happens, but then *do* look at the reclaimable pages.

> Could that create problems if we have many concurrent
> reclaimers?

With increased concurrency, the likelihood of OOM will go up if we
remove the unlimited wait for isolated pages, that much is true.

I'm not sure that's a bad thing, however, because we want the OOM
killer to be predictable and timely. So a reasonable wait time in
between 0 and forever before an allocating thread gives up under
extreme concurrency makes sense to me.

> It may be OK, I just do not understand all the implications.
>
> I like the general direction your patch takes the code in,
> but I would like to understand it better...

I feel the same way. The throttling logic doesn't seem to be very well
thought out at the moment, making it hard to reason about what happens
in certain scenarios.

In that sense, this patch isn't really an overall improvement to the
way things work. It patches a hole that seems to be exploitable only
from an artificial OOM torture test, at the risk of regressing high
concurrency workloads that may or may not be artificial.

Unless I'm mistaken, there doesn't seem to be a whole lot of urgency
behind this patch. Can we think about a general model to deal with
allocation concurrency? Unlimited parallel direct reclaim is kinda
bonkers in the first place. How about checking for excessive isolation
counts from the page allocator and putting allocations on a waitqueue?