[RFC PATCH 32/40] mm: debug: prevent infinite recursion in dump_page() with CMA
From: Rik van Riel
Date: Wed May 20 2026 - 11:17:46 EST
dump_page() calls is_migrate_cma_folio() which expands to
get_pfnblock_migratetype(&folio->page, pfn). That helper resolves
the pageblock via pfn_to_pageblock(), and on !CONFIG_SPARSEMEM
configurations pfn_to_pageblock() reads page_zone(page) to compute
the per-zone pageblock_data offset.
When dump_page() is invoked on a page whose zone is not initialised
(unavailable PFN ranges, very early boot, or a poisoned struct page),
page_zone() returns garbage and the VM_BUG_ON_PAGE for
!zone_spans_pfn() in get_pfnblock_flags_word() fires. The BUG handler
then calls dump_page() on the same page, which re-enters the same
code path, hits the same BUG, and recurses until the kernel runs out
of stack.
Guard the is_migrate_cma_folio() call with pfn_valid() and
zone_spans_pfn(), calling page_zone() only after pfn_valid() has
succeeded. dump_page() can now safely report on pages without a
meaningful zone, and the "CMA" suffix is printed only if the page
is genuinely in a CMA pageblock.
Found by: dump_page() called from a VM_BUG_ON_PAGE in early boot
hitting a page in an unavailable range, recursing until stack
exhaustion.
Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
Assisted-by: Claude:claude-opus-4.7 syzkaller
---
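The recursion and the guard order described in the changelog can be
modelled in userspace. This is a rough sketch for illustration only,
not kernel code: every model_* name, the struct layouts, and the
MODEL_MAX_DEPTH cap (which stands in for stack exhaustion) are
hypothetical. A zone that does not span the PFN stands in for the
garbage that page_zone() would return on an uninitialised page.

```c
#include <stdbool.h>
#include <stdio.h>

#define MODEL_MAX_DEPTH 8	/* hypothetical cap standing in for stack exhaustion */

struct model_zone {
	unsigned long start_pfn;
	unsigned long spanned_pages;
};

struct model_page {
	unsigned long pfn;
	bool pfn_valid;			/* models the result of pfn_valid(pfn) */
	const struct model_zone *zone;	/* models the result of page_zone(page) */
	bool cma;			/* models the pageblock migratetype */
};

static int model_deepest;	/* deepest nesting level reached so far */

static void model_dump_page(const struct model_page *page, bool guarded,
			    int depth);

static bool model_zone_spans_pfn(const struct model_zone *zone,
				 unsigned long pfn)
{
	return zone->start_pfn <= pfn &&
	       pfn < zone->start_pfn + zone->spanned_pages;
}

/* Models the migratetype lookup, including its span assertion. */
static bool model_is_cma(const struct model_page *page, bool guarded,
			 int depth)
{
	if (!model_zone_spans_pfn(page->zone, page->pfn)) {
		/* Models the BUG handler dumping the same page again. */
		model_dump_page(page, guarded, depth + 1);
		return false;
	}
	return page->cma;
}

static void model_dump_page(const struct model_page *page, bool guarded,
			    int depth)
{
	bool cma = false;

	if (depth > model_deepest)
		model_deepest = depth;
	if (depth >= MODEL_MAX_DEPTH)
		return;		/* the model stops here instead of overflowing */

	if (!guarded)
		cma = model_is_cma(page, guarded, depth);
	else if (page->pfn_valid &&
		 model_zone_spans_pfn(page->zone, page->pfn))
		cma = model_is_cma(page, guarded, depth);

	printf("depth %d: flags:%s\n", depth, cma ? " CMA" : "");
}

/* Returns the deepest nesting level reached while dumping one page. */
static int model_run(const struct model_page *page, bool guarded)
{
	model_deepest = 0;
	model_dump_page(page, guarded, 0);
	return model_deepest;
}
```

In this model, calling model_run() with guarded == false on a page
whose zone does not span its PFN nests until the cap is reached, while
guarded == true returns after a single level and never evaluates the
span assertion. The pfn_valid test is placed first so that the zone is
only consulted once the PFN is known to be backed by a struct page,
mirroring the ordering in the hunk below.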
mm/debug.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/mm/debug.c b/mm/debug.c
index d4542d5d202b..e233520b009c 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -73,6 +73,7 @@ static void __dump_folio(const struct folio *folio, const struct page *page,
{
struct address_space *mapping = folio_mapping(folio);
int mapcount = atomic_read(&page->_mapcount) + 1;
+ bool cma = false;
char *type = "";
if (page_mapcount_is_type(mapcount))
@@ -112,9 +113,24 @@ static void __dump_folio(const struct folio *folio, const struct page *page,
* "isolate" again in the meantime, but since we are just dumping the
* state for debugging, it should be fine to accept a bit of
* inaccuracy here due to racing.
+ *
+ * Guard the is_migrate_cma_folio() call with pfn_valid() and
+ * zone_spans_pfn(). The macro calls get_pfnblock_migratetype()
+ * which calls get_pfnblock_flags_word() which has a VM_BUG_ON_PAGE
+ * for !zone_spans_pfn(). If that fires, dump_page() recurses
+ * infinitely. Call page_zone() only after pfn_valid() to avoid
+ * dereferencing uninitialized zone data during early boot.
*/
+#ifdef CONFIG_CMA
+ if (pfn_valid(pfn)) {
+ struct zone *zone = page_zone(page);
+
+ if (zone_spans_pfn(zone, pfn))
+ cma = is_migrate_cma_folio(folio, pfn);
+ }
+#endif
pr_warn("%sflags: %pGp%s\n", type, &folio->flags,
- is_migrate_cma_folio(folio, pfn) ? " CMA" : "");
+ cma ? " CMA" : "");
if (page_has_type(&folio->page))
pr_warn("page_type: %x(%s)\n", folio->page.page_type >> 24,
page_type_name(folio->page.page_type));
--
2.54.0