On Wed, 15 Jul 2009 22:38:53 -0400
Rik van Riel <riel@xxxxxxxxxx> wrote:
When way too many processes go into direct reclaim, it is possible
for all of the pages to be taken off the LRU. One result of this
is that the next process in the page reclaim code thinks there are
no reclaimable pages left and triggers an out of memory kill.
One solution to this problem is to never let so many processes into
the page reclaim path that the entire LRU is emptied. Limiting the
system to only having half of each inactive list isolated for
reclaim should be safe.
Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
---
This patch goes on top of Kosaki's "Account the number of isolated pages"
patch series.
mm/vmscan.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
Index: mmotm/mm/vmscan.c
===================================================================
--- mmotm.orig/mm/vmscan.c 2009-07-08 21:37:01.000000000 -0400
+++ mmotm/mm/vmscan.c 2009-07-08 21:39:02.000000000 -0400
@@ -1035,6 +1035,27 @@ int isolate_lru_page(struct page *page)
}
/*
+ * Are there way too many processes in the direct reclaim path already?
+ */
+static int too_many_isolated(struct zone *zone, int file)
+{
+	unsigned long inactive, isolated;
+
+	if (current_is_kswapd())
+		return 0;
+
+	if (file) {
+		inactive = zone_page_state(zone, NR_INACTIVE_FILE);
+		isolated = zone_page_state(zone, NR_ISOLATED_FILE);
+	} else {
+		inactive = zone_page_state(zone, NR_INACTIVE_ANON);
+		isolated = zone_page_state(zone, NR_ISOLATED_ANON);
+	}
+
+	return isolated > inactive;
+}
Why does this mean "too many"?
And could you put this check under scanning_global_lru(sc)?
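
One possible reading of the comparison, as a user-space sketch rather than kernel code: assuming that isolated pages are subtracted from NR_INACTIVE_* when they are taken off the LRU (which is how I understand the "Account the number of isolated pages" series), "isolated > inactive" is the same as "more than half of the original inactive list is isolated". That would match the "half of each inactive list" wording in the changelog. Both function names below are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/*
 * User-space sketch, not kernel code.
 *
 * Assumption: "inactive" counts only the pages still on the inactive
 * list, i.e. isolated pages have already been subtracted from it.
 */
bool too_many_isolated_sketch(unsigned long inactive,
			      unsigned long isolated)
{
	/* Same comparison as in the patch. */
	return isolated > inactive;
}

/*
 * Hypothetical helper: is more than half of the original list
 * (inactive + isolated) currently isolated?
 * Ignores overflow, which is fine for an illustration.
 */
bool more_than_half_isolated(unsigned long inactive,
			     unsigned long isolated)
{
	unsigned long total = inactive + isolated;

	return 2 * isolated > total;
}
```

Under that assumption the two functions agree for every input, since 2 * isolated > inactive + isolated reduces to isolated > inactive. If NR_INACTIVE_* still included the isolated pages, the check would instead only trigger once the whole list was isolated, so the changelog should probably spell this out.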