Re: [RFC mm v6] mm: introduce a new page type for page pool in page type
From: Jesper Dangaard Brouer
Date: Tue Nov 18 2025 - 04:42:46 EST
On 18/11/2025 02.18, Byungchul Park wrote:
On Tue, Nov 18, 2025 at 10:07:35AM +0900, Byungchul Park wrote:
On Mon, Nov 17, 2025 at 05:47:05PM +0100, David Hildenbrand (Red Hat) wrote:
On 17.11.25 17:02, Jesper Dangaard Brouer wrote:
On 17/11/2025 06.20, Byungchul Park wrote:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 600d9e981c23..01dd14123065 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1041,7 +1041,6 @@ static inline bool page_expected_state(struct page *page,
#ifdef CONFIG_MEMCG
page->memcg_data |
#endif
- page_pool_page_is_pp(page) |
(page->flags.f & check_flags)))
return false;
@@ -1068,8 +1067,6 @@ static const char *page_bad_reason(struct page *page, unsigned long flags)
if (unlikely(page->memcg_data))
bad_reason = "page still charged to cgroup";
#endif
- if (unlikely(page_pool_page_is_pp(page)))
- bad_reason = "page_pool leak";
return bad_reason;
}
This code have helped us catch leaks in the past.
When this happens the result is that the page is marked as a bad page.
@@ -1378,9 +1375,12 @@ __always_inline bool free_pages_prepare(struct page *page,
mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
folio->mapping = NULL;
}
- if (unlikely(page_has_type(page)))
+ if (unlikely(page_has_type(page))) {
+ /* networking expects to clear its page type before releasing */
+ WARN_ON_ONCE(PageNetpp(page));
/* Reset the page_type (which overlays _mapcount) */
page->page_type = UINT_MAX;
+ }
if (is_check_pages_enabled()) {
if (free_page_is_bad(page))
What happens to the page? ... when it gets marked with:
page->page_type = UINT_MAX
Will it get freed and allowed to be used by others?
- if so it can result in other hard-to-catch bugs
Yes, just like most other use-after-free from any other subsystem in the
kernel :)
The expectation is that such BUGs are found early during testing
(triggering a WARN) such that they can be fixed early.
But we could also report a bad page here and just stop (return false).
I agree, that we want to catch these bugs early by triggering a WARN.
The bad_page() call also triggers dump_stack() and have a burst limiter,
which I like. We are running with CONFIG_DEBUG_VM=y in production (as
the measured overhead was minimal) to monitor these kind of leaks.
For the case with page_pool, we *could* recover more gracefully, by
returning the page to the page_pool (page->pp) instance. But I'm
reluctant to taking this path, as that puts less pressure on fixing the
leak as we "recovered", as this becomes are warning and not a bug.
Opinions are welcomed, should we recover or do bad_page() ?
I think the WARN_ON_ONCE() makes the problematic situation detectable.
However, if we should prevent the page from being used on the detection,
sure, I can update the patch.
I will respin with the following diff folded on the top.
LGTM
Byungchul
---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 01dd14123065..5ae55a5d7b5d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1377,7 +1377,10 @@ __always_inline bool free_pages_prepare(struct page *page,
}
if (unlikely(page_has_type(page))) {
/* networking expects to clear its page type before releasing */
- WARN_ON_ONCE(PageNetpp(page));
+ if (unlikely(PageNetpp(page))) {
+ bad_page(page, "page_pool leak");
+ return false;
+ }
/* Reset the page_type (which overlays _mapcount) */
page->page_type = UINT_MAX;
}
Thanks,
Byungchul
--
Cheers
David
--Jesper