Re: [PATCH] mm: page_isolation: Avoid hugepage scan step underflow

From: Kaitao Cheng

Date: Fri May 22 2026 - 05:40:41 EST


在 2026/5/20 01:54, Andrew Morton 写道:
> On Tue, 19 May 2026 20:16:46 +0800 Kaitao Cheng <kaitao.cheng@xxxxxxxxx> wrote:
>
>> page_is_unmovable() checks HugeTLB pages without holding hugetlb_lock and
>> without pinning the folio. This is intentional for the pageblock scanning
>> paths, but it means the HugeTLB folio can be freed concurrently after
>> PageHuge() or folio_test_hugetlb() succeeds.
>
> Thanks. What are the userspace-visible runtime effects of this?

The direct user-visible effect is not memory corruption, but degraded forward
progress in the page isolation / contiguous allocation path.

If the race makes folio_nr_pages() return 1 while the current page is still
treated as a tail page of the old HugeTLB folio, the computed step can
underflow:

step = folio_nr_pages(folio) - folio_page_idx(folio, page);

The caller then does:

start_pfn += step;

With unsigned arithmetic this can wrap and move start_pfn backwards, typically
near the beginning of the old hugepage range rather than advancing past it. In
many cases this only means rescanning part of the same hugepage range, so the
effect may be limited to extra scanning work.

However, it still violates the scanner's forward-progress assumption: step
is expected to advance start_pfn. If the same transient state is observed
repeatedly, the scanner can keep revisiting the same PFNs, causing excessive
latency and, in the worst case, an apparent stall in operations that rely on
page isolation or contiguous allocation.


Here is another point raised by AI, the old code also used folio_test_lru()
on a folio pointer obtained without holding a reference. If the folio is
freed and the old head page is reused or observed as a tail page of another
compound page, folio_test_lru() can reach const_folio_flags(), which asserts
that the passed folio is not a tail page. On DEBUG_VM kernels, that can
trigger a VM_BUG_ON_PGFLAGS() crash.

>> The existing code avoids folio_hstate() and uses size_to_hstate() because
>> the HugeTLB flag may already have been cleared. However, if
>> size_to_hstate() returns NULL, the code still falls through and computes
>> the scan step from folio_nr_pages(). If the folio has been freed and the
>> head/large state has been cleared, folio_nr_pages() can return 1. When the
>> current page is a tail page, subtracting folio_page_idx() from 1 can
>> underflow and make the scanner skip too far.
>>
>> Treat a NULL hstate as unmovable so the scanner does not try to skip over
>> an unstable HugeTLB folio. Once a valid hstate is found, derive the number
>> of pages from the hstate instead of reading the folio size again. Also
>> validate the page index before computing the step to avoid underflow if the
>> page/folio relationship changed concurrently.
>
> This code sounds rather sketchy, and it sounds like it will remain
> sketchy after the patch. And AI review says "hey, this code is
> sketchy":
> https://sashiko.dev/#/patchset/20260519121646.40833-1-kaitao.cheng@xxxxxxxxx

Following David Hildenbrand's suggestion, I made some changes as shown below.
I'm not sure whether there are still any other issues.

--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -41,8 +41,14 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
* We need not scan over tail pages because we don't
* handle each tail page individually in migration.
*/
- if (PageHuge(page) || PageCompound(page)) {
+ if (PageCompound(page)) {
struct folio *folio = page_folio(page);
+ unsigned long nr_pages, pfn;
+ unsigned int order;
+
+ order = compound_order(&folio->page);
+ if (order > MAX_FOLIO_ORDER)
+ return true;

if (folio_test_hugetlb(folio)) {
struct hstate *h;
@@ -54,15 +60,16 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
* The huge page may be freed so can not
* use folio_hstate() directly.
*/
- h = size_to_hstate(folio_size(folio));
- if (h && !hugepage_migration_supported(h))
+ h = size_to_hstate(PAGE_SIZE << order);
+ if (!h || !hugepage_migration_supported(h))
return true;
-
- } else if (!folio_test_lru(folio)) {
+ } else if (!PageLRU(page)) {
return true;
}

- *step = folio_nr_pages(folio) - folio_page_idx(folio, page);
+ nr_pages = 1UL << order;
+ pfn = page_to_pfn(page);
+ *step = (pfn | (nr_pages - 1)) + 1 - pfn;
return false;
}

--
Thanks
Kaitao Cheng