Re: [RFC mm v6] mm: introduce a new page type for page pool in page type

From: Toke Høiland-Jørgensen
Date: Tue Nov 18 2025 - 05:32:38 EST


Jesper Dangaard Brouer <hawk@xxxxxxxxxx> writes:

> On 18/11/2025 02.18, Byungchul Park wrote:
>> On Tue, Nov 18, 2025 at 10:07:35AM +0900, Byungchul Park wrote:
>>> On Mon, Nov 17, 2025 at 05:47:05PM +0100, David Hildenbrand (Red Hat) wrote:
>>>> On 17.11.25 17:02, Jesper Dangaard Brouer wrote:
>>>>>
>>>>> On 17/11/2025 06.20, Byungchul Park wrote:
>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>>> index 600d9e981c23..01dd14123065 100644
>>>>>> --- a/mm/page_alloc.c
>>>>>> +++ b/mm/page_alloc.c
>>>>>> @@ -1041,7 +1041,6 @@ static inline bool page_expected_state(struct page *page,
>>>>>> #ifdef CONFIG_MEMCG
>>>>>> page->memcg_data |
>>>>>> #endif
>>>>>> - page_pool_page_is_pp(page) |
>>>>>> (page->flags.f & check_flags)))
>>>>>> return false;
>>>>>>
>>>>>> @@ -1068,8 +1067,6 @@ static const char *page_bad_reason(struct page *page, unsigned long flags)
>>>>>> if (unlikely(page->memcg_data))
>>>>>> bad_reason = "page still charged to cgroup";
>>>>>> #endif
>>>>>> - if (unlikely(page_pool_page_is_pp(page)))
>>>>>> - bad_reason = "page_pool leak";
>>>>>> return bad_reason;
>>>>>> }
>>>>>
>>>>> This code have helped us catch leaks in the past.
>>>>> When this happens the result is that the page is marked as a bad page.
>>>>>
>>>>>>
>>>>>> @@ -1378,9 +1375,12 @@ __always_inline bool free_pages_prepare(struct page *page,
>>>>>> mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
>>>>>> folio->mapping = NULL;
>>>>>> }
>>>>>> - if (unlikely(page_has_type(page)))
>>>>>> + if (unlikely(page_has_type(page))) {
>>>>>> + /* networking expects to clear its page type before releasing */
>>>>>> + WARN_ON_ONCE(PageNetpp(page));
>>>>>> /* Reset the page_type (which overlays _mapcount) */
>>>>>> page->page_type = UINT_MAX;
>>>>>> + }
>>>>>>
>>>>>> if (is_check_pages_enabled()) {
>>>>>> if (free_page_is_bad(page))
>>>>>
>>>>> What happens to the page? ... when it gets marked with:
>>>>> page->page_type = UINT_MAX
>>>>>
>>>>> Will it get freed and allowed to be used by others?
>>>>> - if so it can result in other hard-to-catch bugs
>>>>
>>>> Yes, just like most other use-after-free from any other subsystem in the
>>>> kernel :)
>>>>
>>>> The expectation is that such BUGs are found early during testing
>>>> (triggering a WARN) such that they can be fixed early.
>>>>
>>>> But we could also report a bad page here and just stop (return false).
>
> I agree, that we want to catch these bugs early by triggering a WARN.
> The bad_page() call also triggers dump_stack() and have a burst limiter,
> which I like. We are running with CONFIG_DEBUG_VM=y in production (as
> the measured overhead was minimal) to monitor these kind of leaks.
>
> For the case with page_pool, we *could* recover more gracefully, by
> returning the page to the page_pool (page->pp) instance. But I'm
> reluctant to taking this path, as that puts less pressure on fixing the
> leak as we "recovered", as this becomes are warning and not a bug.
> Opinions are welcomed, should we recover or do bad_page() ?

I think we should do bad_page() to get the bugs fixed :)

-Toke