Re: [PATCH v2 1/3] mm/page_alloc: Optimize free_contig_range()

From: Zi Yan

Date: Tue Mar 17 2026 - 11:27:32 EST


On 16 Mar 2026, at 12:19, Vlastimil Babka (SUSE) wrote:

> On 3/16/26 17:02, Zi Yan wrote:
>> On 16 Mar 2026, at 11:21, Vlastimil Babka wrote:
>>
>>>> +/*
>>>> + * free_pages_prepare() has already been called for page(s) being freed.
>>>> + * TODO: Perform per-subpage free_pages_prepare() checks for order > 0 pages
>>>> + * (HWPoison, PageNetpp, bad free page).
>>>> + */
>>>
>>> I'm confused, and reading the v1 thread didn't help either. Where would the
>>> subpages to check come from? AFAICS we start from order-0 pages always.
>>> __free_contig_range calls free_pages_prepare on every page with order 0
>>> unconditionally, so we check every page as an order-0 page. If we then free
>>> the bunch of individually checked pages as a high-order page, there's no
>>> reason to check those subpages again, no? Am I missing something?
>>
>> There are two kinds of order > 0 pages, compound and not compound.
>> free_pages_prepare() checks all tail pages of a compound order > 0 page too.
>> For non-compound ones, free_pages_prepare() only runs the free_page_is_bad()
>> check on the tail pages.
>>
>> So my guess is that the TODO is to check all subpages on a non compound
>> order > 0 one in the same manner. This is based on the assumption that
>
> OK but:
>
> 1) Why put that TODO specifically on FPI_PREPARED definition, which is for
> the case we skip the prepare/check?
> 2) Why add it in this series which AFAICS doesn't handle non-compound
> order>0 anywhere.
> 3) We'd better work on eliminating the non-compound order>0 usages
> altogether, rather than work on supporting them better.

I agreed with you when I first saw this, but after thinking about it again,
the issue might not be directly related to the allocation side; it is about the
free path. As the patch title says, this is an optimization for freeing
contiguous pages. These physically contiguous pages happen to come from a
non-compound order>0 allocation, and that is what creates the optimization
opportunity.

The problem they want to solve is speeding up the page free path by freeing
a group of pages together. They are optimizing for a special situation where
a group of pages is physically contiguous, so that these pages can be freed
via free_pages(page, order /* > 0 */). If we take away the allocation of
non-compound order>0 pages, as you suggested in 3), we basically remove the
optimization opportunity from them. I am not sure that is what people want.

To think about the problem more broadly: how could we optimize a
free_page_bulk(), if such a thing existed? By sorting the input pages by PFN,
so that we can free them as high-order blocks instead of as individual
order-0 pages. This patch basically says: the group of pages we are freeing
are all contiguous, since that is how we allocated them, so freeing them as
a whole is much quicker than freeing them individually.

>
>> all non compound order > 0 page users use split_page() after the allocation,
>> treat each page individually, and free them back altogether. But I am not
>> sure if this is true for all users allocating non compound order > 0 pages.
>
> Maybe as part of the elimination (point 3 above) we should combine the
> allocation+split so it's never the first without the second anymore.
>
>> And free_pages_prepare_bulk() might be a better name for such functions.
>>
>> The above confusion is also a reason I asked Ryan to try adding an unsplit_page()
>> function to fuse back non compound order > 0 pages and free the fused one
>> as we are currently doing. But that looks like a pain to implement. Maybe an
>
> Yeah not sure it's worth it either.
>
>> alternative to this FPI_PREPARED is to add FPI_FREE_BULK and loop through all
>> subpages if FPI_FREE_BULK is set with
>> __free_pages_prepare(page + i, 0, fpi_flags & ~FPI_FREE_BULK) in
>> __free_pages_ok().
>
> Hmm, maybe...
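To make the FPI_FREE_BULK alternative concrete, the loop I have in mind in
__free_pages_ok() would be roughly the following (pseudocode-level, untested,
function names as in my earlier mail):

```c
	/* Pages were allocated non-compound, so check each subpage. */
	if (fpi_flags & FPI_FREE_BULK) {
		for (i = 0; i < (1 << order); i++)
			if (!__free_pages_prepare(page + i, 0,
						  fpi_flags & ~FPI_FREE_BULK))
				return;
	}
```

i.e. the per-subpage checks stay, and only the buddy insertion happens once
at high order.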

Let me know if my reasoning above moves your opinion on FPI_FREE_BULK in a
more positive direction. :)


Best Regards,
Yan, Zi