[PATCH v2 9/9] mm/compaction: new threshold for compaction depleted zone

From: Joonsoo Kim
Date: Sun Aug 23 2015 - 22:20:46 EST


The compaction algorithm is now more powerful: the migration scanner
traverses the whole zone range. The old threshold for a depleted zone,
which was designed to imitate the compaction deferring approach, is
therefore no longer appropriate for the current algorithm. If we keep
the current threshold of 1, we can't avoid excessive compaction
overhead, because a single compaction for a low order allocation
succeeds easily in almost any situation.

This patch re-implements the threshold calculation based on zone size
and the requested allocation order. We judge whether compaction
possibility is depleted by the number of successful compactions.
Roughly, 1/100 of the area expected to be scanned should be allocated as
high order pages during one compaction iteration; otherwise the zone's
compaction possibility is considered depleted.
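
As a rough illustration of the arithmetic, the new threshold can be
reproduced in a userspace sketch. The function name depleted_threshold()
and its plain integer parameters are hypothetical stand-ins for
zone->managed_pages and zone->compact_order_failed; the shifts and the
minimum of 4 mirror the diff at the end of this mail. Note that the
final shift by 7 divides by 128, which is the "1/100 roughly" above.

```c
#include <assert.h>
#include <limits.h>

/* Mirrors COMPACT_MIN_DEPLETE_THRESHOLD as set by this patch. */
#define MIN_DEPLETE_THRESHOLD 4UL

/*
 * Hypothetical userspace helper, not kernel code: managed_pages stands
 * in for zone->managed_pages and order for zone->compact_order_failed.
 */
unsigned long depleted_threshold(unsigned long managed_pages,
				 unsigned int order)
{
	unsigned long nr_possible;
	unsigned long threshold;

	/* Shifting by the full width of the type would be undefined. */
	assert(order < sizeof(unsigned long) * CHAR_BIT);

	/* Upper bound on order-sized blocks the zone could hold. */
	nr_possible = managed_pages >> order;

	/* The migration scanner is assumed to cover under 1/4 of the zone. */
	nr_possible >>= 2;

	/* Require roughly 1/100 (exactly 1/128) of those to succeed. */
	threshold = nr_possible >> 7;

	return threshold > MIN_DEPLETE_THRESHOLD ?
		threshold : MIN_DEPLETE_THRESHOLD;
}
```

For example, assuming a single zone of 131072 managed pages (512 MB of
4 KiB pages) and a failed order of 3, this gives
131072 >> 3 >> 2 >> 7 = 32 successes. At order 9 the shifts reach 0, so
the minimum of 4 applies.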

Below are test results with the following setup.

Memory is artificially fragmented to make order 3 allocation hard, and
most pageblocks are changed to the movable migratetype.

System: 512 MB with 32 MB Zram
Memory: 25% of memory is allocated to create fragmentation and 200 MB
        is occupied by a memory hogger. Most pageblocks are of the
        movable migratetype.
Fragmentation: There may be roughly 1500 candidates for a successful
        order 3 allocation.
Allocation attempts: Roughly 3000 order 3 allocation attempts
        with GFP_NORETRY. This value is chosen to saturate allocation
        success.

Test: hogger-frag-movable

Success(N)                     94        83
compact_stall                3642      4048
compact_success               144       212
compact_fail                 3498      3835
pgmigrate_success        15897219    216387
compact_isolated         31899553    487712
compact_migrate_scanned  59146745   2513245
compact_free_scanned     49566134   4124319

This change greatly decreases compaction overhead when the zone's
compaction possibility is nearly depleted. I should admit that it's not
perfect, because the compaction success rate decreases. Tuning the
threshold more precisely would recover this regression, but that
depends heavily on the workload, so I'm not doing it here.
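
To put the overhead reduction in numbers, the counters above can be
compared column by column. This sketch assumes the first column is the
kernel before this patch and the second is with it applied (the table
does not label them); reduction_factor() is a hypothetical helper, not
kernel code.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many times smaller a counter became.
 * 'before' and 'after' are raw vmstat-style counter values.
 */
double reduction_factor(unsigned long before, unsigned long after)
{
	/* A zero 'after' value would make the ratio meaningless. */
	assert(after != 0);
	return (double)before / (double)after;
}
```

With the hogger-frag-movable numbers, reduction_factor(15897219, 216387)
for pgmigrate_success comes out at roughly 73, and
reduction_factor(59146745, 2513245) for compact_migrate_scanned at
roughly 23.5.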

The other test doesn't show a big regression.

System: 512 MB with 32 MB Zram
Memory: 25% of memory is allocated to create fragmentation and a kernel
        build is running in the background. Most pageblocks are of the
        movable migratetype.
Fragmentation: There may be roughly 1500 candidates for a successful
        order 3 allocation.
Allocation attempts: Roughly 3000 order 3 allocation attempts
        with GFP_NORETRY. This value is chosen to saturate allocation
        success.

Test: build-frag-movable

Success(N)                     89        87
compact_stall                4053      3642
compact_success               264       202
compact_fail                 3788      3440
pgmigrate_success         6497642    153413
compact_isolated         13292640    353445
compact_migrate_scanned  69714502   2307433
compact_free_scanned     20243121   2325295

This looks like a reasonable trade-off.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
---
mm/compaction.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index e61ee77..e1b44a5 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -129,19 +129,24 @@ static struct page *pageblock_pfn_to_page(unsigned long start_pfn,

 /* Do not skip compaction more than 64 times */
 #define COMPACT_MAX_FAILED 4
-#define COMPACT_MIN_DEPLETE_THRESHOLD 1UL
+#define COMPACT_MIN_DEPLETE_THRESHOLD 4UL
 #define COMPACT_MIN_SCAN_LIMIT (pageblock_nr_pages)

 static bool compaction_depleted(struct zone *zone)
 {
-	unsigned long threshold;
+	unsigned long nr_possible;
 	unsigned long success = zone->compact_success;
+	unsigned long threshold;

-	/*
-	 * Now, to imitate current compaction deferring approach,
-	 * choose threshold to 1. It will be changed in the future.
-	 */
-	threshold = COMPACT_MIN_DEPLETE_THRESHOLD;
+	nr_possible = zone->managed_pages >> zone->compact_order_failed;
+
+	/* Migration scanner normally scans less than 1/4 range of zone */
+	nr_possible >>= 2;
+
+	/* We hope to succeed more than 1/100 roughly */
+	threshold = nr_possible >> 7;
+
+	threshold = max(threshold, COMPACT_MIN_DEPLETE_THRESHOLD);
 	if (success >= threshold)
 		return false;

--
1.9.1
