Re: [RFC PATCH 0/8] Introduce Reserved THP
From: Zi Yan
Date: Tue Jun 30 2026 - 19:34:28 EST
On Tue Jun 30, 2026 at 6:59 PM EDT, Barry Song wrote:
> On Mon, Jun 29, 2026 at 8:20 PM David Hildenbrand (Arm)
> <david@xxxxxxxxxx> wrote:
> [...]
>> >
>> > 2. Implementation
>> > =================
>> >
>> > In 2024, Yu Zhao proposed a similar idea:
>> >
>> > Link: https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@xxxxxxxxxx/
>> >
>> > The idea was to introduce two virt zones: ZONE_NOSPLIT and ZONE_NOMERGE to
>> > guarantee the allocation success rate of THP, achieving an effect similar to
>> > reservation. However, it seems there was no further progress, perhaps because of
>> > reluctance to introduce more virt zones like ZONE_MOVABLE.
>> >
>> > This RFC wants to discuss another implementation:
>> >
>> > 1. Introduce a new migratetype: MIGRATE_RESERVED_THP.
>> > 2. Introduce two new hugetlb-like kernel boot parameters: `thp_reserved_size`
>> > and `thp_reserved_nr`. When set, the required memory is marked as
>> > MIGRATE_RESERVED_THP and put back into the buddy allocator.
>>
>> I'm all for some mechanism to make runtime allocation of large chunks of memory
>> easier, by adding a pool from where multiple consumers (THP, guest_memfd,
>> hugetlb, whatever) can allocate memory.
>>
>> Call me very skeptical of getting the page allocator involved like this. (I hate it)
>
> One thing we've been thinking about for a while is whether we can
> introduce something at the pageblock level to let memory "remember"
> which allocation order is preferred within that pageblock.
>
> For example, if we ever allocate an order-0 page from pageblock 100,
> that pageblock would later prefer order-0 allocations. Similarly, if
> we allocate a large folio from pageblock 200, we would avoid using
> pageblock 200 for order-0 allocations as long as there is still
> memory available in pageblock 100 for order-0.
>
> Since order-0 allocations are often the main source of fragmentation,
> if we already have both pagecache and anonymous large folios, we may
> care more about containing or quarantining order-0 allocations in
> certain areas, rather than trying to maintain a large-folio pool or
> similar strategy.

Aren't unmovable pages the ones causing fragmentation? Movable pages,
regardless of their order, can always be migrated as long as no additional
pin is present.

If we use per-order pageblocks, how do we make use of pageblocks whose
orders are rarely requested? Do we allow lower-order allocations to fall
back to higher-order pageblocks?
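
To make the fallback question concrete, here is a toy user-space model of a
per-pageblock order preference. It is only a sketch under stated assumptions:
none of it is kernel code, every name (toy_pageblock, toy_alloc, ...) is
hypothetical, and it only counts free base pages per block without modelling
alignment, migratetypes, or the buddy free lists.

```c
#include <stddef.h>

/* Toy user-space model, NOT kernel code. All names are hypothetical. */
#define NR_BLOCKS   4
#define BLOCK_PAGES 512   /* assumed: 2 MiB pageblock, 4 KiB base pages */
#define ORDER_NONE  (-1)

struct toy_pageblock {
	int preferred_order;  /* ORDER_NONE until the first allocation */
	int free_pages;       /* free base pages; contiguity is ignored */
};

static struct toy_pageblock blocks[NR_BLOCKS];

static void toy_init(void)
{
	for (int i = 0; i < NR_BLOCKS; i++) {
		blocks[i].preferred_order = ORDER_NONE;
		blocks[i].free_pages = BLOCK_PAGES;
	}
}

/* Take 2^order pages from block idx; the first user sets the preference. */
static int toy_take(int idx, int order)
{
	blocks[idx].free_pages -= 1 << order;
	if (blocks[idx].preferred_order == ORDER_NONE)
		blocks[idx].preferred_order = order;
	return idx;
}

/* Return the index of the pageblock used, or -1 if nothing fits. */
static int toy_alloc(int order)
{
	int need = 1 << order;

	/* Pass 1: a block that already prefers this order. */
	for (int i = 0; i < NR_BLOCKS; i++)
		if (blocks[i].preferred_order == order &&
		    blocks[i].free_pages >= need)
			return toy_take(i, order);

	/* Pass 2: a block nobody has allocated from yet. */
	for (int i = 0; i < NR_BLOCKS; i++)
		if (blocks[i].preferred_order == ORDER_NONE &&
		    blocks[i].free_pages >= need)
			return toy_take(i, order);

	/* Pass 3: fallback of a lower order into a higher-order block. */
	for (int i = 0; i < NR_BLOCKS; i++)
		if (blocks[i].preferred_order > order &&
		    blocks[i].free_pages >= need)
			return toy_take(i, order);

	return -1;
}
```

In this model a block that prefers a rarely used order (say order 4) sits
mostly idle until pass 3 lets order-0 requests spill into it, at which point
the preference no longer protects that block. The toy never lets a higher
order fall back into a lower-order block, and where to draw that line is
exactly the policy question above.
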
>
> Chris’s de-fragmentation of swap slots[1] seems to be a big success
> based on my observations, where he provides a similar memory-order
> preference for swap clusters. There is no reservation mechanism, no
> sysfs knob, and no need to split swap into two areas—everything
> just works automatically.
>
> I wonder if you would be interested in something similar at the
> pageblock level. If so, I’d be happy to work on a prototype in
> August. I’m completely booked in July.
>
> [1] https://lore.kernel.org/all/20240730-swap-allocator-v5-0-cb9c148b9297@xxxxxxxxxx/
>
I feel that swap and page allocation have a fundamental distinction:
swap slots are not movable, but pages are. Memory compaction can
move pages around to make space for high-order allocations, but does
swap support something similar? How would page mobility fit into this
swap-slot defragmentation model?

In addition, when swap space is full, or when only order-0 swap slots are
available but higher-order folios need to be swapped out, folio swap-out
might simply stop (unless the folios are split to fill the order-0 slots).
For page allocation, however, some pages can be reclaimed or swapped out to
make space, and this adds complexity.

--
Best Regards,
Yan, Zi