[PATCH v2 5/5] mm, compaction: adapt isolation_suitable flushing to kcompactd
From: Vlastimil Babka
Date: Mon Feb 08 2016 - 08:38:50 EST
Compaction maintains a pageblock_skip bitmap to record pageblocks where
isolation recently failed. This bitmap can be reset in three ways:
1) direct compaction is restarting after going through the full deferred cycle
2) kswapd goes to sleep, and some other direct compaction has previously
finished scanning the whole zone and set zone->compact_blockskip_flush.
Note that a successful direct compaction clears this flag.
3) compaction was invoked manually via trigger in /proc
Case 2) is somewhat fuzzy to begin with, but it should be updated now that
kcompactd has been introduced. Both the check for direct compaction in 1) and
the setting of the flush flag in 2) use current_is_kswapd(), which doesn't
work for kcompactd. Thus, this patch adds bool direct_compaction to
compact_control to use in 2). For case 1) we remove the check completely:
unlike the former kswapd compaction, kcompactd does use the deferred
compaction functionality, so flushing tied to restarting from deferred
compaction makes sense here.
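As a rough illustration of the two checks after this change, here is a
userspace sketch, not kernel code. The struct names mirror the kernel's but
are reduced to the fields the checks need; skip_bits_valid, the helper names
whole_zone_scanned() and compaction_begin(), and the `restarting` argument
(standing in for compaction_restarting()) are assumptions made for this
example only.

```c
#include <stdbool.h>

/* Reduced stand-ins for the kernel structures: only the fields used below. */
struct zone {
	bool compact_blockskip_flush;	/* flush requested for when kswapd sleeps */
	bool skip_bits_valid;		/* models a populated pageblock_skip bitmap */
};

struct compact_control {
	bool direct_compaction;		/* false from kcompactd or /proc */
};

/*
 * Case 2: a scan covered the whole zone without success. Only direct
 * compaction requests the later flush; kcompactd and /proc do not.
 */
static void whole_zone_scanned(struct zone *zone,
			       const struct compact_control *cc)
{
	if (cc->direct_compaction)
		zone->compact_blockskip_flush = true;
}

/*
 * Case 1: compaction restarts after the deferred cycle. With this patch
 * the reset no longer depends on who the caller is. The `restarting`
 * argument stands in for compaction_restarting(zone, cc->order).
 */
static void compaction_begin(struct zone *zone, bool restarting)
{
	if (restarting)
		zone->skip_bits_valid = false;
}
```

In this model a kcompactd-style caller (direct_compaction == false) never
sets the flush flag, while any caller that restarts from a deferred state
clears the skip bits.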
Note that when kswapd goes to sleep, kcompactd is woken up, so it will see the
flushed pageblock_skip bits. This is different from when the former kswapd
compaction observed the bits and I believe it makes more sense. Kcompactd can
afford to be more thorough than a direct compaction trying to limit allocation
latency, or kswapd whose primary goal is to reclaim.
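The ordering described above (flush when kswapd goes to sleep, then kcompactd
starts) can be sketched in userspace as follows. This is a simplified model
under stated assumptions, not the kernel implementation: the helper names and
skip_bits_valid are hypothetical, and the sketch assumes the flush flag is
cleared once the flush has happened.

```c
#include <stdbool.h>

struct zone {
	bool compact_blockskip_flush;	/* set earlier by direct compaction */
	bool skip_bits_valid;		/* models a populated pageblock_skip bitmap */
};

/* Flush the skip bits only if a direct compactor asked for it. */
static void reset_if_requested(struct zone *zone)
{
	if (zone->compact_blockskip_flush) {
		zone->skip_bits_valid = false;
		zone->compact_blockskip_flush = false;	/* assumed: cleared on flush */
	}
}

/*
 * Hypothetical model of kswapd going to sleep: flush first, then wake
 * kcompactd. Returns what kcompactd observes: true if it starts from
 * flushed (cleared) skip bits.
 */
static bool kswapd_sleep_then_wake_kcompactd(struct zone *zone)
{
	reset_if_requested(zone);
	return !zone->skip_bits_valid;
}
```

If no direct compaction requested a flush, kcompactd in this model starts
with the existing skip bits still in place.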
To sum up, after this patch the pageblock_skip flushing intuitively makes
more sense for kcompactd. Practically, the differences are minimal.
Stress-highalloc with order-9 allocations, without direct reclaim/compaction:
stress-highalloc
                           4.5-rc1               4.5-rc1
                            4-test                5-test
Success 1 Min       3.00 (  0.00%)        5.00 (-66.67%)
Success 1 Mean      4.00 (  0.00%)        6.20 (-55.00%)
Success 1 Max       6.00 (  0.00%)        7.00 (-16.67%)
Success 2 Min       3.00 (  0.00%)        5.00 (-66.67%)
Success 2 Mean      4.20 (  0.00%)        6.40 (-52.38%)
Success 2 Max       6.00 (  0.00%)        7.00 (-16.67%)
Success 3 Min      63.00 (  0.00%)       62.00 (  1.59%)
Success 3 Mean     64.60 (  0.00%)       63.80 (  1.24%)
Success 3 Max      67.00 (  0.00%)       65.00 (  2.99%)
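A note on reading the percentages in parentheses: the reported figures are
consistent with the relative change being computed against the first column
as (baseline - value) / baseline, so a negative percentage corresponds to a
higher success rate. The formula is inferred from the numbers above, not
taken from the benchmark's source, and relative_change() is a name made up
for this sketch.

```c
/*
 * Relative change as it appears in the Success rows, inferred from the
 * reported numbers: positive when the value dropped below the baseline,
 * negative when it rose above it.
 */
static double relative_change(double baseline, double value)
{
	return (baseline - value) / baseline * 100.0;
}
```

For example, a baseline of 3.00 and a value of 5.00 give (3 - 5) / 3 * 100,
which rounds to -66.67, matching the "Success 1 Min" row.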
4.5-rc1 4.5-rc1
4-test 5-test
User 3088.82 3181.09
System 1142.01 1158.25
Elapsed 1780.91 1799.37
4.5-rc1 4.5-rc1
4-test 5-test
Minor Faults 106582816 107907437
Major Faults 813 734
Swap Ins 311 235
Swap Outs 5598 5485
Allocation stalls 184 207
DMA allocs 32 31
DMA32 allocs 74843238 75757965
Normal allocs 25886668 26130990
Movable allocs 0 0
Direct pages scanned 31429 32797
Kswapd pages scanned 2185293 2202613
Kswapd pages reclaimed 2134389 2143524
Direct pages reclaimed 31234 32545
Kswapd efficiency 97% 97%
Kswapd velocity 1228.666 1218.536
Direct efficiency 99% 99%
Direct velocity 17.671 18.144
Percentage direct scans 1% 1%
Zone normal velocity 291.409 286.309
Zone dma32 velocity 954.928 950.371
Zone dma velocity 0.000 0.000
Page writes by reclaim 5598.600 5485.600
Page writes file 0 0
Page writes anon 5598 5485
Page reclaim immediate 96 60
Sector Reads 4307161 4293509
Sector Writes 11053091 11072127
Page rescued immediate 0 0
Slabs scanned 1555770 1549506
Direct inode steals 2025 7018
Kswapd inode steals 45418 40265
Kswapd skipped wait 0 0
THP fault alloc 614 612
THP collapse alloc 324 316
THP splits 0 0
THP fault fallback 730 778
THP collapse fail 14 16
Compaction stalls 959 1007
Compaction success 69 67
Compaction failures 890 939
Page migrate success 662054 721374
Page migrate failure 32846 23469
Compaction pages isolated 1370326 1479924
Compaction migrate scanned 7025772 8812554
Compaction free scanned 73302642 84327916
Compaction cost 762 838
With direct reclaim/compaction:
stress-highalloc
                           4.5-rc1               4.5-rc1
                           4-test2               5-test2
Success 1 Min       6.00 (  0.00%)        9.00 (-50.00%)
Success 1 Mean      8.40 (  0.00%)       10.00 (-19.05%)
Success 1 Max      13.00 (  0.00%)       11.00 ( 15.38%)
Success 2 Min       6.00 (  0.00%)        9.00 (-50.00%)
Success 2 Mean      8.60 (  0.00%)       10.00 (-16.28%)
Success 2 Max      12.00 (  0.00%)       11.00 (  8.33%)
Success 3 Min      75.00 (  0.00%)       74.00 (  1.33%)
Success 3 Mean     75.60 (  0.00%)       75.20 (  0.53%)
Success 3 Max      76.00 (  0.00%)       76.00 (  0.00%)
4.5-rc1 4.5-rc1
4-test2 5-test2
User 3258.62 3246.04
System 1177.92 1172.29
Elapsed 1837.02 1836.76
4.5-rc1 4.5-rc1
4-test2 5-test2
Minor Faults 109392253 109773220
Major Faults 755 864
Swap Ins 155 262
Swap Outs 5790 5871
Allocation stalls 4562 4540
DMA allocs 34 39
DMA32 allocs 76901680 77122082
Normal allocs 26587089 26748274
Movable allocs 0 0
Direct pages scanned 108854 120966
Kswapd pages scanned 2131589 2135012
Kswapd pages reclaimed 2090937 2108388
Direct pages reclaimed 108699 120577
Kswapd efficiency 98% 98%
Kswapd velocity 1160.870 1170.537
Direct efficiency 99% 99%
Direct velocity 59.283 66.321
Percentage direct scans 4% 5%
Zone normal velocity 294.389 293.821
Zone dma32 velocity 925.764 943.036
Zone dma velocity 0.000 0.000
Page writes by reclaim 5790.600 5871.200
Page writes file 0 0
Page writes anon 5790 5871
Page reclaim immediate 218 225
Sector Reads 4376989 4428264
Sector Writes 11102113 11110668
Page rescued immediate 0 0
Slabs scanned 1692486 1709123
Direct inode steals 16266 6898
Kswapd inode steals 28364 38351
Kswapd skipped wait 0 0
THP fault alloc 567 652
THP collapse alloc 326 354
THP splits 0 0
THP fault fallback 805 793
THP collapse fail 18 16
Compaction stalls 2070 2025
Compaction success 527 518
Compaction failures 1543 1507
Page migrate success 2423657 2360608
Page migrate failure 28790 40852
Compaction pages isolated 4916017 4802025
Compaction migrate scanned 19370264 21750613
Compaction free scanned 360662356 344372001
Compaction cost 2745 2694
Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
---
mm/compaction.c | 10 +++++-----
mm/internal.h | 1 +
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index c03715ba65c7..67bb651c56b1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1191,11 +1191,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
/*
* Mark that the PG_migrate_skip information should be cleared
- * by kswapd when it goes to sleep. kswapd does not set the
+ * by kswapd when it goes to sleep. kcompactd does not set the
* flag itself as the decision to be clear should be directly
* based on an allocation request.
*/
- if (!current_is_kswapd())
+ if (cc->direct_compaction)
zone->compact_blockskip_flush = true;
return COMPACT_COMPLETE;
@@ -1338,10 +1338,9 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
/*
* Clear pageblock skip if there were failures recently and compaction
- * is about to be retried after being deferred. kswapd does not do
- * this reset as it'll reset the cached information when going to sleep.
+ * is about to be retried after being deferred.
*/
- if (compaction_restarting(zone, cc->order) && !current_is_kswapd())
+ if (compaction_restarting(zone, cc->order))
__reset_isolation_suitable(zone);
/*
@@ -1477,6 +1476,7 @@ static unsigned long compact_zone_order(struct zone *zone, int order,
.mode = mode,
.alloc_flags = alloc_flags,
.classzone_idx = classzone_idx,
+ .direct_compaction = true,
};
INIT_LIST_HEAD(&cc.freepages);
INIT_LIST_HEAD(&cc.migratepages);
diff --git a/mm/internal.h b/mm/internal.h
index 17ae0b52534b..013a786fa37f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -181,6 +181,7 @@ struct compact_control {
unsigned long last_migrated_pfn;/* Not yet flushed page being freed */
enum migrate_mode mode; /* Async or sync migration mode */
bool ignore_skip_hint; /* Scan blocks even if marked skip */
+ bool direct_compaction; /* False from kcompactd or /proc/... */
int order; /* order a direct compactor needs */
const gfp_t gfp_mask; /* gfp mask of a direct compactor */
const int alloc_flags; /* alloc flags of a direct compactor */
--
2.7.0