Re: [PATCH] mm: limit direct reclaim for higher order allocations

From: Vlastimil Babka
Date: Mon Mar 07 2016 - 10:42:32 EST


On 02/24/2016 10:38 PM, Rik van Riel wrote:
> For multi page allocations smaller than PAGE_ALLOC_COSTLY_ORDER,
> the kernel will do direct reclaim if compaction failed for any
> reason. This worked fine when Linux systems had 128MB RAM, but
> on my 24GB system I frequently see higher order allocations
> free up over 3GB of memory, pushing all kinds of things into
> swap, and slowing down applications.
>
> It would be much better to limit the amount of reclaim done,
> rather than cause excessive pageout activity.
>
> When enough memory is free to do compaction for the highest order
> allocation possible, bail out of the direct page reclaim code.
>
> On smaller systems, this may be enough to obtain contiguous
> free memory areas to satisfy small allocations, continuing our
> strategy of relying on luck occasionally. On larger systems,
> relying on luck like that has not been working for years.
>
> Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>

So the main point of this patch is the change from "continue" to "return true", right? That stops reclaim from looking at the remaining zones, but I guess that's not why reclaim frees 3GB out of 24GB for you without this patch?
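
To make the "continue" vs. "return true" difference concrete, here is a toy model of the zone loop. It is only a sketch: struct toy_zone and toy_zones_shrunk() are made-up names, and the real check in shrink_zones() also depends on sc->order and the zone index, which are left out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a zone: only the one property the check looks at. */
struct toy_zone {
	bool compaction_ready;	/* hypothetical: what compaction_ready() would say */
};

/*
 * Count how many zones would be handed to shrink_zone().
 * stop_at_first_ready == false models the old "continue",
 * stop_at_first_ready == true models the patched "return true".
 */
static size_t toy_zones_shrunk(const struct toy_zone *zones, size_t nr,
			       bool stop_at_first_ready)
{
	size_t shrunk = 0;

	for (size_t i = 0; i < nr; i++) {
		if (zones[i].compaction_ready) {
			if (stop_at_first_ready)
				return shrunk;	/* patched: leave the loop entirely */
			continue;		/* old: skip only this zone */
		}
		shrunk++;			/* this zone would be reclaimed from */
	}
	return shrunk;
}
```

With four zones where only the second one is ready for compaction, the old behaviour still reclaims from three zones, while the patched behaviour reclaims from one and then stops.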

What I suspect more is should_continue_reclaim(), which wants to reclaim (2UL << sc->order) pages regardless of watermarks or compaction status. But that one is called from shrink_zone(), and shrink_zones() should not call shrink_zone() if compaction is ready, even before this patch. Perhaps if multiple processes manage to enter shrink_zone() simultaneously, they could over-reclaim because of that?
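
For a sense of scale, the (2UL << sc->order) target can be worked out with a few lines of C. This is a simplified sketch, not the kernel function: the toy_* names are made up, and the real should_continue_reclaim() has further conditions that are not modelled here.

```c
#include <stdbool.h>

/* Reclaim target for one pass, as in the (2UL << sc->order) expression. */
static unsigned long toy_pages_for_compaction(unsigned int order)
{
	return 2UL << order;
}

/*
 * Simplified view of the "keep going" decision: continue while fewer
 * pages than the target have been reclaimed. Watermarks and compaction
 * status are deliberately not consulted, which is the point above.
 */
static bool toy_wants_more_reclaim(unsigned int order,
				   unsigned long nr_reclaimed)
{
	return nr_reclaimed < toy_pages_for_compaction(order);
}
```

For order 3 the target is 16 pages, which would be 64KB assuming 4KB pages. A single caller is therefore nowhere near 3GB on its own, which is why many concurrent callers look like the more plausible way to get there.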

> ---
>  mm/vmscan.c | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fc62546096f9..8dd15d514761 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2584,20 +2584,17 @@ static bool shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  				continue;	/* Let kswapd poll it */
>  
>  			/*
> -			 * If we already have plenty of memory free for
> -			 * compaction in this zone, don't free any more.
> -			 * Even though compaction is invoked for any
> -			 * non-zero order, only frequent costly order
> -			 * reclamation is disruptive enough to become a
> -			 * noticeable problem, like transparent huge
> -			 * page allocations.
> +			 * For higher order allocations, free enough memory
> +			 * to be able to do compaction for the largest possible
> +			 * allocation. On smaller systems, this may be enough
> +			 * that smaller allocations can skip compaction, if
> +			 * enough adjacent pages get freed.
>  			 */
> -			if (IS_ENABLED(CONFIG_COMPACTION) &&
> -			    sc->order > PAGE_ALLOC_COSTLY_ORDER &&
> +			if (IS_ENABLED(CONFIG_COMPACTION) && sc->order &&
>  			    zonelist_zone_idx(z) <= requested_highidx &&
> -			    compaction_ready(zone, sc->order)) {
> +			    compaction_ready(zone, MAX_ORDER)) {
>  				sc->compaction_ready = true;
> -				continue;
> +				return true;
>  			}
>  
>  			/*