[PATCH] mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list

From: Johannes Weiner
Date: Fri Jul 28 2023 - 14:29:41 EST


Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
spinlock") bypasses the pcplist on lock contention and returns the
page directly to the buddy list of the page's migratetype.

For pages that don't have their own pcplist, such as CMA and
HIGHATOMIC, the migratetype is temporarily updated such that the page
can hitch a ride on the MOVABLE pcplist. Their true type is later
reassessed when flushing in free_pcppages_bulk(). However, when lock
contention is detected after the type was already overriden, the
bypass will then put the page on the wrong buddy list.

Once on the MOVABLE buddy list, the page becomes eligible for
fallbacks and even stealing. In the case of HIGHATOMIC, otherwise
ineligible allocations can dip into the highatomic reserves. In the
case of CMA, the page can be lost from the CMA region permanently.

Use a separate pcpmigratetype variable for the pcplist override. Use
the original migratetype when going directly to the buddy. This fixes
the bug and should make the intentions more obvious in the code.

Originally sent here to address the HIGHATOMIC case:
https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@xxxxxxxxxxx/

Changelog updated in response to the CMA-specific bug report.

[mgorman@xxxxxxxxxxxxxxxxxxx: updated changelog]
Reported-by: Joe Liu <joe.liu@xxxxxxxxxxxx>
Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
---
mm/page_alloc.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0c5be12f9336..95546f376302 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2400,7 +2400,7 @@ void free_unref_page(struct page *page, unsigned int order)
struct per_cpu_pages *pcp;
struct zone *zone;
unsigned long pfn = page_to_pfn(page);
- int migratetype;
+ int migratetype, pcpmigratetype;

if (!free_unref_page_prepare(page, pfn, order))
return;
@@ -2408,24 +2408,24 @@ void free_unref_page(struct page *page, unsigned int order)
/*
* We only track unmovable, reclaimable and movable on pcp lists.
* Place ISOLATE pages on the isolated list because they are being
- * offlined but treat HIGHATOMIC as movable pages so we can get those
- * areas back if necessary. Otherwise, we may have to free
+ * offlined but treat HIGHATOMIC and CMA as movable pages so we can
+ * get those areas back if necessary. Otherwise, we may have to free
* excessively into the page allocator
*/
- migratetype = get_pcppage_migratetype(page);
+ migratetype = pcpmigratetype = get_pcppage_migratetype(page);
if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
if (unlikely(is_migrate_isolate(migratetype))) {
free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
return;
}
- migratetype = MIGRATE_MOVABLE;
+ pcpmigratetype = MIGRATE_MOVABLE;
}

zone = page_zone(page);
pcp_trylock_prepare(UP_flags);
pcp = pcp_spin_trylock(zone->per_cpu_pageset);
if (pcp) {
- free_unref_page_commit(zone, pcp, page, migratetype, order);
+ free_unref_page_commit(zone, pcp, page, pcpmigratetype, order);
pcp_spin_unlock(pcp);
} else {
free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
--
2.42.0