Re: [PATCH] mm: page_isolation: Avoid hugepage scan step underflow

From: David Hildenbrand (Arm)

Date: Wed May 20 2026 - 05:25:22 EST


On 5/19/26 14:16, Kaitao Cheng wrote:
> From: Kaitao Cheng <chengkaitao@xxxxxxxxxx>
>
> page_is_unmovable() checks HugeTLB pages without holding hugetlb_lock and
> without pinning the folio. This is intentional for the pageblock scanning
> paths, but it means the HugeTLB folio can be freed concurrently after
> PageHuge() or folio_test_hugetlb() succeeds.
>
> The existing code avoids folio_hstate() and uses size_to_hstate() because
> the HugeTLB flag may already have been cleared. However, if
> size_to_hstate() returns NULL, the code still falls through and computes
> the scan step from folio_nr_pages(). If the folio has been freed and the
> head/large state has been cleared, folio_nr_pages() can return 1. When the
> current page is a tail page, subtracting folio_page_idx() from 1 can
> underflow and make the scanner skip too far.
>
> Treat a NULL hstate as unmovable so the scanner does not try to skip over
> an unstable HugeTLB folio. Once a valid hstate is found, derive the number
> of pages from the hstate instead of reading the folio size again. Also
> validate the page index before computing the step to avoid underflow if the
> page/folio relationship changed concurrently.
>
> Fixes: a0a9f2180b90 ("mm: page_isolation: avoid calling folio_hstate() without hugetlb_lock")
> Signed-off-by: Kaitao Cheng <chengkaitao@xxxxxxxxxx>
> ---
> mm/page_isolation.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index c48ff5c00244..99f0b06efaf6 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -43,6 +43,7 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
> */
> if (PageHuge(page) || PageCompound(page)) {
> struct folio *folio = page_folio(page);
> + unsigned long idx, nr_pages;
>
> if (folio_test_hugetlb(folio)) {
> struct hstate *h;
> @@ -55,14 +56,21 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
> * use folio_hstate() directly.
> */
> h = size_to_hstate(folio_size(folio));
> - if (h && !hugepage_migration_supported(h))
> + if (!h || !hugepage_migration_supported(h))
> return true;
>
> + nr_pages = pages_per_huge_page(h);
> } else if (!folio_test_lru(folio)) {
> return true;
> + } else {
> + nr_pages = folio_nr_pages(folio);
> }
>
> - *step = folio_nr_pages(folio) - folio_page_idx(folio, page);
> + idx = folio_page_idx(folio, page);
> + if (idx >= nr_pages)
> + return true;
> +
> + *step = nr_pages - idx;
> return false;
> }
>

We have very similar code in scan_movable_pages() that we previously fixed.

Seems like we can avoid folio_page_idx() completely by just doing "pfn |=
nr_pages - 1".

If, in corner cases, we skip over some unmovable pages, too bad. The function is
inherently racy.
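
To illustrate, here is a rough userspace sketch, not kernel code. The helper
names are made up for this mail, and it assumes the folio is naturally aligned
in pfn space with a power-of-two number of pages. It contrasts the subtraction
that can wrap with the OR-based rounding, which cannot move backwards or wrap
for a stale nr_pages:

```c
#include <assert.h>

/*
 * Hypothetical userspace model; none of these helpers exist in the kernel.
 */

/*
 * Subtraction-based step: remaining pages in the folio starting at idx.
 * Wraps around to a huge value if idx >= nr_pages, e.g. when the folio was
 * freed concurrently and nr_pages is now read as 1 for a former tail page.
 */
static inline unsigned long step_by_subtraction(unsigned long nr_pages,
						unsigned long idx)
{
	return nr_pages - idx;
}

/*
 * OR-based variant: round pfn up to the last pfn of the naturally aligned
 * nr_pages block that contains it. The result is never smaller than pfn
 * and never leaves that block, whatever (power-of-two) nr_pages was read.
 */
static inline unsigned long last_pfn_in_block(unsigned long pfn,
					      unsigned long nr_pages)
{
	/* Assumption: nr_pages is a non-zero power of two. */
	assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);
	return pfn | (nr_pages - 1);
}
```

With a stale nr_pages of 1, the OR-based variant simply leaves the pfn where it
is, so the scanner moves on to the next page instead of jumping far ahead.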

--
Cheers,

David