[RFC PATCH 42/45] mm: page_alloc: cross-MOV borrow within tainted SPBs

From: Rik van Riel

Date: Thu Apr 30 2026 - 16:32:02 EST


From: Rik van Riel <riel@xxxxxxxx>

Pass 2c (cross-non-movable borrow) is restricted to UNMOV<->RECL: it
borrows individual buddies from the opposite non-movable migratetype's
free list within a tainted SPB without relabeling the source pageblock.
Movable free pages within tainted SPBs are deliberately excluded
because long-lived non-movable content in a MOV-tagged pageblock
blocks compaction of that pageblock.

Under workloads that mostly free MOVABLE-tagged content into tainted
SPBs (page-cache reclaim, anon LRU shrink), the result is a tainted
SPB with tens to hundreds of thousands of free pages, all on the MOV
free list and invisible to non-movable demand. Pass 1 doesn't see them
(they're not on the requesting mt's list), Pass 2/2b can't claim a
whole pageblock when sb->nr_free == 0 (no contiguous free PB to
relabel), and Pass 2c skips MOV. The non-movable alloc falls through
to Pass 3 and taints a fresh clean SPB even though the existing
tainted ones have plenty of unused space.

Add Pass 2d, mirroring Pass 2c semantics but borrowing from the
MOVABLE free list within already-tainted SPBs. The borrowed page is
used for the requesting non-movable mt for the lifetime of the
allocation, then on free returns to the MOVABLE list (no pageblock
relabel; same "borrow" mechanism as 2c).
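
The fall-through described above and the borrow-and-return behaviour
can be sketched with a toy user-space model. This is not the kernel
code: struct toy_spb, toy_alloc(), toy_free() and the have_pass_2d
switch are hypothetical names invented for illustration, the model
tracks only per-list page counts (no buddy orders, no locking), and
Pass 2 and 2b are folded into a single step.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy migratetypes; the names echo the kernel's, but this is not kernel code. */
enum toy_mt { TOY_UNMOVABLE, TOY_RECLAIMABLE, TOY_MOVABLE, TOY_NR_MT };

enum toy_outcome { TOY_PASS_1, TOY_PASS_2, TOY_PASS_2C, TOY_PASS_2D, TOY_PASS_3 };

/* One tainted SPB, reduced to counters (hypothetical layout). */
struct toy_spb {
	unsigned long nr_free_pageblocks;	/* whole free pageblocks, for Pass 2/2b */
	unsigned long free[TOY_NR_MT];		/* free pages on each migratetype list */
};

/* A page remembers only which free list it goes back to on free. */
struct toy_page {
	enum toy_mt list_mt;
};

/* Take one page from the given free list, if it has any. */
static bool toy_take(struct toy_spb *sb, enum toy_mt list, struct toy_page *pg)
{
	if (!sb->free[list])
		return false;
	sb->free[list]--;
	pg->list_mt = list;	/* a borrow never relabels the source pageblock */
	return true;
}

/*
 * Walk the passes for one non-movable request against a tainted SPB.
 * TOY_PASS_3 means "would taint a fresh clean SPB"; *pg is not filled in then.
 */
static enum toy_outcome toy_alloc(struct toy_spb *sb, enum toy_mt mt,
				  bool have_pass_2d, struct toy_page *pg)
{
	enum toy_mt other;

	assert(mt == TOY_UNMOVABLE || mt == TOY_RECLAIMABLE);
	other = (mt == TOY_UNMOVABLE) ? TOY_RECLAIMABLE : TOY_UNMOVABLE;

	if (toy_take(sb, mt, pg))		/* Pass 1: own free list */
		return TOY_PASS_1;
	if (sb->nr_free_pageblocks) {		/* Pass 2/2b: claim a whole pageblock */
		sb->nr_free_pageblocks--;
		pg->list_mt = mt;		/* block is relabeled; its other pages are ignored here */
		return TOY_PASS_2;
	}
	if (toy_take(sb, other, pg))		/* Pass 2c: UNMOV<->RECL borrow */
		return TOY_PASS_2C;
	if (have_pass_2d && toy_take(sb, TOY_MOVABLE, pg))	/* Pass 2d: MOV borrow */
		return TOY_PASS_2D;
	return TOY_PASS_3;
}

/* Freeing puts the page back on the list it was taken from. */
static void toy_free(struct toy_spb *sb, const struct toy_page *pg)
{
	sb->free[pg->list_mt]++;
}
```

With free = { 0, 0, 1000 } and no free pageblocks, a TOY_UNMOVABLE
request falls through to TOY_PASS_3 when have_pass_2d is false and
succeeds as TOY_PASS_2D when it is true; toy_free() then returns the
page to the movable count.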

Tradeoff: the borrowed UNMOV/RECL content blocks compaction of its
source pageblock until the alloc is freed. Pass 2d is restricted to
SB_TAINTED SPBs, so the contamination is bounded to one pageblock
inside an already-tainted SPB. The alternative, Pass 3 tainting a
fresh clean SPB, removes a 1 GiB region from the clean pool, which is
strictly worse for the anti-fragmentation invariant the series is
built around.
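
To put rough numbers on that comparison, here is a small sketch of the
arithmetic. The 1 GiB SPB size comes from the text above; the 4 KiB
page size and pageblock_order of 9 (2 MiB pageblocks) are assumptions
that match a common x86-64 configuration, not values taken from this
series, and the toy_* names are hypothetical.

```c
#include <assert.h>

/* Assumed geometry: 4 KiB pages and pageblock_order 9 (2 MiB pageblocks). */
#define TOY_PAGE_SHIFT		12
#define TOY_PAGEBLOCK_ORDER	9
/* SPB size as stated in the commit message: 1 GiB. */
#define TOY_SPB_BYTES		(1ULL << 30)

/* Bytes covered by one pageblock under the assumed geometry. */
static unsigned long long toy_pageblock_bytes(void)
{
	return 1ULL << (TOY_PAGE_SHIFT + TOY_PAGEBLOCK_ORDER);
}

/* Pageblocks in one SPB: the units a Pass 3 taint removes from the clean pool. */
static unsigned long long toy_pageblocks_per_spb(void)
{
	assert(TOY_SPB_BYTES % toy_pageblock_bytes() == 0);
	return TOY_SPB_BYTES / toy_pageblock_bytes();
}
```

Under these assumptions an SPB holds 512 pageblocks, so a Pass 2d
borrow blocks compaction of one pageblock in an SPB that is already
tainted, while a Pass 3 taint takes all 512 out of the clean pool.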

Skipped for movable allocs (they use Pass 4) and CMA allocs.

Observable as the new SPB_ALLOC_OUTCOME_PASS_2D outcome on the
spb_alloc_walk tracepoint. Expected effect on the live workload:
tainted SPB count growth slows substantially; allocations that were
previously taking the PASS_3 escape now succeed in PASS_2D.

Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
Assisted-by: Claude:claude-opus-4.7 syzkaller
---
mm/page_alloc.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2f5d3ba1c0ef..af499f0a1a48 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3280,6 +3280,79 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
 				}
 			}
 		}
+
+		/*
+		 * Pass 2d: cross-MOV borrow within tainted SPBs.
+		 *
+		 * If Pass 1/2/2b/2c all failed, the next step is Pass 3
+		 * which would taint a fresh clean SPB. Before that, try
+		 * to borrow an individual buddy from a tainted SPB's
+		 * MIGRATE_MOVABLE free list.
+		 *
+		 * Tainted SPBs accumulate large amounts of free space on
+		 * the MOV free list (e.g. reclaimed page-cache pages
+		 * whose pageblock tag is MOVABLE). Pass 1 cannot see
+		 * those for non-movable allocs, Pass 2/2b cannot claim a
+		 * whole pageblock when sb->nr_free == 0, and Pass 2c is
+		 * restricted to UNMOV<->RECL. The result is a tainted
+		 * SPB with tens to hundreds of thousands of free pages
+		 * all unreachable from non-movable demand.
+		 *
+		 * Borrow semantics mirror Pass 2c: take a buddy from the
+		 * MOVABLE free list without relabeling the source
+		 * pageblock. The page is used for the requesting non-
+		 * movable mt for the lifetime of the allocation, then on
+		 * free returns to the MOVABLE list.
+		 *
+		 * Cost: the borrowed UNMOV/RECL content blocks
+		 * compaction of its source pageblock until freed.
+		 * Restricted to SB_TAINTED so the contamination is
+		 * bounded to an already-tainted SPB; the alternative
+		 * (Pass 3) taints a fresh clean SPB and removes a 1 GiB
+		 * region from the clean pool, which is strictly worse.
+		 *
+		 * Skipped for movable allocs (they have Pass 4) and for
+		 * CMA allocs.
+		 */
+		if (!movable && !is_migrate_cma(migratetype)) {
+			for (full = SB_FULL; full < __NR_SB_FULLNESS; full++) {
+				list_for_each_entry(sb,
+					&zone->spb_lists[SB_TAINTED][full], list) {
+					int co;
+
+					if (!sb->nr_free_pages)
+						continue;
+					for (co = min_t(int, pageblock_order - 1,
+							NR_PAGE_ORDERS - 1);
+					     co >= (int)order;
+					     --co) {
+						current_order = co;
+						area = &sb->free_area[current_order];
+						page = get_page_from_free_area(
+							area, MIGRATE_MOVABLE);
+						if (!page)
+							continue;
+						if (get_pageblock_isolate(page))
+							continue;
+						if (is_migrate_cma(
+							get_pageblock_migratetype(page)))
+							continue;
+						page_del_and_expand(zone, page,
+							order, current_order,
+							MIGRATE_MOVABLE);
+						__spb_set_has_type(page,
+							migratetype);
+						if (spb_below_shrink_high_water(sb))
+							queue_spb_slab_shrink(zone);
+						trace_mm_page_alloc_zone_locked(
+							page, order, migratetype,
+							pcp_allowed_order(order) &&
+							migratetype < MIGRATE_PCPTYPES);
+						return page;
+					}
+				}
+			}
+		}
 	}

 	/*
--
2.52.0