Re: [PATCH] mm, compaction: make fast_isolate_freepages() stay within zone

From: Vlastimil Babka
Date: Thu Feb 18 2021 - 14:11:24 EST


On 2/17/21 6:33 PM, Vlastimil Babka wrote:
> Compaction always operates on pages from a single given zone when isolating
> both pages to migrate and freepages. Pageblock boundaries are intersected with
> zone boundaries to be safe in case zone starts or ends in the middle of
> pageblock. The use of pageblock_pfn_to_page() protects against non-contiguous
> pageblocks.
>
> The functions fast_isolate_freepages() and fast_isolate_around() don't
> currently protect the fast freepage isolation thoroughly enough against these
> corner cases, and can result in freepage isolation operating outside of zone
> boundaries:
>
> - in fast_isolate_freepages() if we get a pfn from the first pageblock of a
> zone that starts in the middle of that pageblock, 'highest' can be a pfn
> outside of the zone. If we fail to isolate anything in this function, we
> may then call fast_isolate_around() on a pfn outside of the zone and there
> effectively do a set_pageblock_skip(page_to_pfn(highest)) which may currently
> hit a VM_BUG_ON() in some configurations
> - fast_isolate_around() checks only the zone end boundary and not beginning,
> nor that the pageblock is contiguous (with pageblock_pfn_to_page()) so it's
> possible that we end up calling isolate_freepages_block() on a range of pfns
> from two different zones and end up e.g. isolating freepages under the wrong
> zone's lock.
>
> This patch should fix the above issues.
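
For readers following along, the "intersect pageblock boundaries with zone
boundaries" idea from the description above can be sketched in userspace
roughly as below. This is only an illustration under stated assumptions, not
the patch and not kernel code: struct zone_range, struct pfn_range,
PAGEBLOCK_NR_PAGES (fixed at 512 here) and clamp_pageblock_to_zone() are
hypothetical stand-ins, and the sketch does not model the contiguity check
that pageblock_pfn_to_page() performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pageblock size; must be a power of two. */
#define PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical zone description: [start_pfn, end_pfn). */
struct zone_range {
	unsigned long start_pfn;	/* first pfn in the zone */
	unsigned long end_pfn;		/* one past the last pfn in the zone */
};

/* Half-open pfn range [start, end) that is safe to scan. */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/* First pfn of the pageblock containing pfn. */
static unsigned long pageblock_start(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

/* One past the last pfn of the pageblock containing pfn. */
static unsigned long pageblock_end(unsigned long pfn)
{
	return pageblock_start(pfn) + PAGEBLOCK_NR_PAGES;
}

/*
 * Compute the part of pfn's pageblock that lies inside the zone.
 * Returns false, leaving *out untouched, if pfn itself is outside the
 * zone, so the caller never goes on to scan another zone's pages.
 */
static bool clamp_pageblock_to_zone(const struct zone_range *z,
				    unsigned long pfn,
				    struct pfn_range *out)
{
	unsigned long start = pageblock_start(pfn);
	unsigned long end = pageblock_end(pfn);

	assert(z->start_pfn < z->end_pfn);

	/* Check the zone start as well as the zone end. */
	if (pfn < z->start_pfn || pfn >= z->end_pfn)
		return false;

	/* Zone begins in the middle of this pageblock. */
	if (start < z->start_pfn)
		start = z->start_pfn;

	/* Zone ends in the middle of this pageblock. */
	if (end > z->end_pfn)
		end = z->end_pfn;

	out->start = start;
	out->end = end;
	return true;
}
```

With a zone covering pfns [700, 2000), a pfn in the first pageblock yields
a range starting at 700 rather than at the pageblock start 512, and a pfn
below 700 is rejected outright instead of being passed on to a scanner.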

Sorry, totally forgot these:

Reported-by: Qian Cai <cai@xxxxxx>
Reported-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>

> Fixes: 5a811889de10 ("mm, compaction: use free lists to quickly locate a migration target")
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>

Also thanks David and Mel for the acks!

Thanks to Mike I was able to boot v5.11 in qemu with memmap containing a type 20
hole as Andrea reported, but can't reproduce the bug so far (i.e. without this
patch, with DEBUG_VM enabled) using transhuge-stress; might need some more
nuanced workload...