Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of RAM that should be free

From: Marc MERLIN
Date: Mon Nov 28 2016 - 15:55:23 EST


On Mon, Nov 28, 2016 at 08:23:15AM +0100, Michal Hocko wrote:
> Marc, could you try this patch please? I think it should be pretty clear
> it should help you but running it through your use case would be more
> than welcome before I ask Greg to take this to the 4.8 stable tree.

This will take a little while, the whole copy took 5 days to finish and I'm a
bit hesitant about blowing it away and starting over :)
Let me see if I can come up with maybe another disk array for another test.

For now, as a reminder, I'm running that attached patch, and it works fine
I'll report back as soon as I can.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a2214c64ed3c..9b3b3a79c58a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3347,17 +3347,24 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
ac->nodemask) {
unsigned long available;
unsigned long reclaimable;
+ int check_order = order;
+ unsigned long watermark = min_wmark_pages(zone);

available = reclaimable = zone_reclaimable_pages(zone);
available -= DIV_ROUND_UP(no_progress_loops * available,
MAX_RECLAIM_RETRIES);
available += zone_page_state_snapshot(zone, NR_FREE_PAGES);

+ if (order > 0 && order <= PAGE_ALLOC_COSTLY_ORDER) {
+ check_order = 0;
+ watermark += 1UL << order;
+ }
+
/*
* Would the allocation succeed if we reclaimed the whole
* available?
*/
- if (__zone_watermark_ok(zone, order, min_wmark_pages(zone),
+ if (__zone_watermark_ok(zone, check_order, watermark,
ac_classzone_idx(ac), alloc_flags, available)) {
/*
* If we didn't make any progress and have a lot of