Re: [RFC 00/12] mm: PUD (1GB) THP implementation
From: David Hildenbrand (arm)
Date: Thu Feb 05 2026 - 06:22:29 EST
On 2/2/26 16:50, Zi Yan wrote:
On 2 Feb 2026, at 6:30, Lorenzo Stoakes wrote:
On Sun, Feb 01, 2026 at 09:44:12PM -0500, Rik van Riel wrote:
To address the obvious objection "but how could we
possibly allocate 1GB huge pages while the workload
is running?", I am planning to pick up the CMA balancing
patch series (thank you, Frank) and get that in an
upstream ready shape soon.
https://lkml.org/2025/9/15/1735
That link doesn't work?
Did a quick search for CMA balancing on lore, couldn't find anything, could you
provide a lore link?
https://lwn.net/Articles/1038263/
That patch set looks like another case where no
amount of internal testing will find every single
corner case, and we'll probably just want to
merge it upstream, deploy it experimentally, and
aggressively deal with anything that might pop up.
I'm not really in favour of this kind of approach. There's plenty of things that
were considered 'temporary' upstream that became rather permanent :)
Maybe we can't cover all corner-cases, but we need to make sure whatever we do
send upstream is maintainable, conceptually sensible and doesn't paint us into
any corners, etc.
With CMA balancing, it would be possible to just
have half (or even more) of system memory for
movable allocations only, which would make it possible
to allocate 1GB huge pages dynamically.
Could you expand on that?
I also would like to hear David’s opinion on using CMA for 1GB THP.
He did not like it[1] when I posted my patch back in 2020, but it has
been more than 5 years. :)
Hehe, not particularly excited about that.
We really have to avoid short-term hacks by any means. We have enough of that in THP land already.
We talked about challenges in the past like:
* Controlling who gets to allocate them
* Having a reasonable swap/migration mechanism
* Reliably allocating them without hacks, while being future-proof
* Long-term pinning them when they are actually on ZONE_MOVABLE or CMA
(the latter could be made working but requires thought)
I agree with Lorenzo that this RFC is a bit surprising, because I assume none of the real challenges were tackled.
That said, it will take me some time to come back to this RFC; other stuff that has piled up is more urgent and more important.
But I'll note that we really have to clean up the THP mess before we add more stuff on top of it.
For example, I still wonder whether we can just stop pre-allocating page tables for THPs and instead let code fail+retry in case we cannot remap the page. I wanted to look into the details a long time ago but never got to it.
Avoiding the pre-allocation would make the remapping much easier; we should then remap from PUD->PMD->PTEs.
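To put a rough number on what pre-allocating for a PUD mapping would mean, here is a small arithmetic sketch. It assumes x86-64 with 4 KiB base pages and 512 entries per page-table level; the helper name tables_for_pud_split is made up for illustration and is not a kernel function.

```python
# Illustrative arithmetic only. Assumptions: x86-64, 4 KiB base pages,
# 512 entries per page-table level. Names are hypothetical, not kernel API.
PAGE_SIZE = 4096
ENTRIES_PER_TABLE = 512

PMD_SIZE = PAGE_SIZE * ENTRIES_PER_TABLE   # range mapped by one PMD entry (2 MiB)
PUD_SIZE = PMD_SIZE * ENTRIES_PER_TABLE    # range mapped by one PUD entry (1 GiB)


def tables_for_pud_split(to_ptes: bool) -> int:
    """Page-table pages needed to remap one PUD-sized mapping.

    PUD->PMD needs a single PMD table. Going on to PTEs needs one
    PTE table for each of the 512 PMD entries as well.
    """
    pmd_tables = 1
    pte_tables = ENTRIES_PER_TABLE if to_ptes else 0
    return pmd_tables + pte_tables


if __name__ == "__main__":
    for to_ptes in (False, True):
        n = tables_for_pud_split(to_ptes)
        level = "PTEs" if to_ptes else "PMDs"
        print(f"split to {level}: {n} tables, {n * PAGE_SIZE} bytes")
```

Under these assumptions, a full split down to PTEs needs 513 page-table pages, roughly 2 MiB, for each 1 GiB mapping.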
Implementing 1 GiB support for shmem might be a reasonable first step, before we start digging into the anonymous memory land with all these nasty things involved.
--
Cheers,
David