Re: [PATCH 06/11] mm, compaction: distinguish between full and partial COMPACT_COMPLETE

From: Vlastimil Babka
Date: Mon Apr 11 2016 - 08:10:40 EST


On 04/05/2016 01:25 PM, Michal Hocko wrote:
From: Michal Hocko <mhocko@xxxxxxxx>

COMPACT_COMPLETE now means that the migration and free scanners met. This
is not very useful information if somebody just wants to use this feedback
and make any decisions based on that. The current caller might be a poor
guy who just happened to scan a tiny portion of the zone and that could be
the reason no suitable pages were compacted. Make sure we distinguish
the full and partial zone walks.

Consumers should treat COMPACT_PARTIAL_SKIPPED as a potential success
and be optimistic in retrying.

The existing users of COMPACT_COMPLETE are conservatively changed to
use COMPACT_PARTIAL_SKIPPED as well, but some of them should probably be
reconsidered and defer the compaction only for COMPACT_COMPLETE
with the new semantic.

This patch shouldn't introduce any functional changes.

Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

With some notes:

@@ -1463,6 +1466,10 @@ static enum compact_result compact_zone(struct zone *zone, struct compact_contro
 		zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
 		zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
 	}
+
+	if (cc->migrate_pfn == start_pfn)
+		cc->whole_zone = true;
+

This assumes that the migrate scanner being at its initial position implies that the free scanner is at its initial position too. That should normally be true, because the migration scanner is the first to run. But reading zone->compact_cached_*_pfn is racy. Worse, zone->compact_cached_migrate_pfn is an array that distinguishes sync and async compaction, while the cached free scanner position is shared. So it's possible that async compaction has advanced both its own cached migrate scanner position and the shared cached free scanner position, and then sync compaction starts its migrate scanner at start_pfn while the free scanner has already advanced.
So you might still see a false positive COMPACT_COMPLETE, just less frequently and probably with much lower impact.
But if you need to be truly reliable, also check that cc->free_pfn == round_down(end_pfn - 1, pageblock_nr_pages).
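
To illustrate the stricter check, here is a rough userspace sketch, not the actual kernel code. The struct scan_pos type and the covers_whole_zone() helper are hypothetical names made up for this example, and the pageblock size of 512 pages is an assumption (order-9 pageblocks with 4K pages); in the kernel the equivalent test would sit next to the cc->whole_zone assignment in compact_zone().

```c
#include <stdbool.h>

/* Assumption: 512 pages per pageblock (order-9 with 4K pages). */
#define PAGEBLOCK_NR_PAGES 512UL

/* Round x down to a multiple of y; y must be a power of two. */
#define ROUND_DOWN(x, y) ((x) & ~((y) - 1))

/* Hypothetical stand-in for the scanner positions in struct compact_control. */
struct scan_pos {
	unsigned long migrate_pfn;	/* migration scanner position */
	unsigned long free_pfn;		/* free scanner position */
};

/*
 * Return true only if both scanners start from their initial positions:
 * the migration scanner at the first pfn of the zone and the free scanner
 * at the start of the last pageblock of the zone.
 */
static bool covers_whole_zone(const struct scan_pos *cc,
			      unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long free_start = ROUND_DOWN(end_pfn - 1, PAGEBLOCK_NR_PAGES);

	return cc->migrate_pfn == start_pfn && cc->free_pfn == free_start;
}
```

With a zone spanning pfns [0, 2048), the helper accepts only migrate_pfn == 0 together with free_pfn == 1536. A sync run that picks up migrate_pfn == 0 but inherits a shared free_pfn of 1024 from an earlier async run would not be counted as a whole-zone walk.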