Re: [RFC 00/12] mm: PUD (1GB) THP implementation

From: Zi Yan

Date: Mon Feb 02 2026 - 10:58:41 EST


On 2 Feb 2026, at 6:30, Lorenzo Stoakes wrote:

> On Sun, Feb 01, 2026 at 09:44:12PM -0500, Rik van Riel wrote:
>> On Sun, 2026-02-01 at 16:50 -0800, Usama Arif wrote:
>>>
>>> 1. Static Reservation: hugetlbfs requires pre-allocating huge pages
>>> at boot
>>>    or runtime, taking memory away. This requires capacity planning,
>>>    administrative overhead, and makes workload orchastration much
>>> much more
>>>    complex, especially colocating with workloads that don't use
>>> hugetlbfs.
>>>
>> To address the obvious objection "but how could we
>> possibly allocate 1GB huge pages while the workload
>> is running?", I am planning to pick up the CMA balancing
>> patch series (thank you, Frank) and get that in an
>> upstream ready shape soon.
>>
>> https://lkml.org/2025/9/15/1735
>
> That link doesn't work?
>
> Did a quick search for CMA balancing on lore, couldn't find anything, could you
> provide a lore link?

https://lwn.net/Articles/1038263/

>
>>
>> That patch set looks like another case where no
>> amount of internal testing will find every single
>> corner case, and we'll probably just want to
>> merge it upstream, deploy it experimentally, and
>> aggressively deal with anything that might pop up.
>
> I'm not really in favour of this kind of approach. There's plenty of things that
> were considered 'temporary' upstream that became rather permanent :)
>
> Maybe we can't cover all corner-cases, but we need to make sure whatever we do
> send upstream is maintainable, conceptually sensible and doesn't paint us into
> any corners, etc.
>
>>
>> With CMA balancing, it would be possibly to just
>> have half (or even more) of system memory for
>> movable allocations only, which would make it possible
>> to allocate 1GB huge pages dynamically.
>
> Could you expand on that?

I also would like to hear David’s opinion on using CMA for 1GB THP.
He did not like it[1] when I posted my patch back in 2020, but it has
been more than 5 years. :)

The other direction I explored is to get 1GB THP from buddy allocator.
That means we need to:
1. bump MAX_PAGE_ORDER to 18 or make it a runtime variable so that only 1GB
THP users need to bump it,
2. handle cross memory section PFN merge in buddy allocator,
3. improve anti-fragmentation mechanism for 1GB range compaction.

1 is easier-ish[2]. I have not looked into 2 and 3 much yet.

[1] https://lore.kernel.org/all/52bc2d5d-eb8a-83de-1c93-abd329132d58@xxxxxxxxxx/
[2] https://lore.kernel.org/all/20210805190253.2795604-1-zi.yan@xxxxxxxx/


Best Regards,
Yan, Zi