[55/70] vmscan: fix a livelock in kswapd

From: Greg KH
Date: Mon Aug 01 2011 - 19:27:02 EST


2.6.39-stable review patch. If anyone has any objections, please let us know.

------------------

From: Shaohua Li <shaohua.li@xxxxxxxxx>

commit 4746efded84d7c5a9c8d64d4c6e814ff0cf9fb42 upstream.

I'm running a workload that triggers heavy swapping on a machine with 4
nodes. After killing the workload, I found a kswapd livelock: sometimes
kswapd2 or kswapd3 keeps running and I can't access the filesystem, even
though most memory is free.

This looks like a regression since commit 08951e545918c159 ("mm: vmscan:
correct check for kswapd sleeping in sleeping_prematurely").

Nodes 2 and 3 have only ZONE_NORMAL, but balance_pgdat() returns 0 for
classzone_idx. The reason is that end_zone in balance_pgdat() defaults
to 0, and if every zone already meets its watermark, end_zone stays 0.

As a result, sleeping_prematurely() always returns true: this is an
order-3 wakeup, and with classzone_idx at 0, both balanced_pages and
present_pages in pgdat_balanced() are 0, so the strict ">" comparison
fails. Add a special case: if a zone has no pages, treat it as balanced.
This fixes the livelock.
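
The effect of the comparison change can be reproduced outside the kernel.
What follows is only a minimal userspace sketch, not the kernel code: the
struct names, the zone count and the *_old/*_new function names are
hypothetical stand-ins, and balanced_pages is passed in directly instead
of being computed from watermarks.

```c
#include <stdbool.h>

#define SKETCH_NR_ZONES 4	/* illustrative; the real count depends on config */

/* Hypothetical, stripped-down stand-ins for struct zone and pg_data_t. */
struct zone_sketch {
	unsigned long present_pages;
};

struct pgdat_sketch {
	struct zone_sketch node_zones[SKETCH_NR_ZONES];
};

/* Sum present_pages over zones 0..classzone_idx, as the patched loop does. */
unsigned long present_up_to(const struct pgdat_sketch *pgdat, int classzone_idx)
{
	unsigned long present_pages = 0;
	int i;

	for (i = 0; i <= classzone_idx; i++)
		present_pages += pgdat->node_zones[i].present_pages;

	return present_pages;
}

/* Comparison before the patch: strict '>' can never hold for 0 vs 0. */
bool pgdat_balanced_old(const struct pgdat_sketch *pgdat,
			unsigned long balanced_pages, int classzone_idx)
{
	return balanced_pages > (present_up_to(pgdat, classzone_idx) >> 2);
}

/* Comparison after the patch: '>=' treats "no pages at all" as balanced. */
bool pgdat_balanced_new(const struct pgdat_sketch *pgdat,
			unsigned long balanced_pages, int classzone_idx)
{
	return balanced_pages >= (present_up_to(pgdat, classzone_idx) >> 2);
}
```

For a node whose zone 0 is empty, calling both helpers with
balanced_pages = 0 and classzone_idx = 0 compares 0 against 0: the old
form returns false, so the node never looks balanced, while the new form
returns true.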

Signed-off-by: Shaohua Li <shaohua.li@xxxxxxxxx>
Acked-by: Mel Gorman <mgorman@xxxxxxx>
Cc: Minchan Kim <minchan.kim@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>

---
mm/vmscan.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2245,7 +2245,8 @@ static bool pgdat_balanced(pg_data_t *pg
 	for (i = 0; i <= classzone_idx; i++)
 		present_pages += pgdat->node_zones[i].present_pages;

-	return balanced_pages > (present_pages >> 2);
+	/* A special case here: if zone has no page, we think it's balanced */
+	return balanced_pages >= (present_pages >> 2);
 }

 /* is kswapd sleeping prematurely? */

