Re: [PATCHv3 10/15] mm/hugetlb: Remove fake head pages

From: David Hildenbrand (Red Hat)

Date: Thu Jan 15 2026 - 14:46:09 EST


On 1/15/26 20:33, David Hildenbrand (Red Hat) wrote:
On 1/15/26 19:58, Kiryl Shutsemau wrote:
On Thu, Jan 15, 2026 at 06:41:44PM +0100, David Hildenbrand (Red Hat) wrote:
On 1/15/26 18:23, Kiryl Shutsemau wrote:
On Thu, Jan 15, 2026 at 05:49:43PM +0100, David Hildenbrand (Red Hat) wrote:
On 1/15/26 15:45, Kiryl Shutsemau wrote:
HugeTLB Vmemmap Optimization (HVO) reduces memory usage by freeing most
vmemmap pages for huge pages and remapping the freed range to a single
page containing the struct page metadata.

With the new mask-based compound_info encoding (for power-of-2 struct
page sizes), all tail pages of the same order are now identical
regardless of which compound page they belong to. This means the tail
pages can be truly shared without fake heads.

Allocate a single page of initialized tail struct pages per NUMA node
per order in the vmemmap_tails[] array in pglist_data. All huge pages
of that order on the node share this tail page, mapped read-only into
their vmemmap. The head page remains unique per huge page.

This eliminates fake heads while maintaining the same memory savings,
and simplifies compound_head() by removing fake head detection.

Signed-off-by: Kiryl Shutsemau <kas@xxxxxxxxxx>
---
include/linux/mmzone.h | 16 ++++++++++++++-
mm/hugetlb_vmemmap.c | 44 ++++++++++++++++++++++++++++++++++++++++--
mm/sparse-vmemmap.c | 44 ++++++++++++++++++++++++++++++++++--------
3 files changed, 93 insertions(+), 11 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 322ed4c42cfc..2ee3eb610291 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -82,7 +82,11 @@
* currently expect (see CONFIG_HAVE_GIGANTIC_FOLIOS): with hugetlb, we expect
* no folios larger than 16 GiB on 64bit and 1 GiB on 32bit.
*/
-#define MAX_FOLIO_ORDER get_order(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G)
+#ifdef CONFIG_64BIT
+#define MAX_FOLIO_ORDER (34 - PAGE_SHIFT)
+#else
+#define MAX_FOLIO_ORDER (30 - PAGE_SHIFT)
+#endif

Where do these magic values stem from, and how do they relate to the
comment above, which clearly spells out 16G vs. 1G?

This doesn't change the resulting value: 1UL << 34 is 16 GiB and 1UL << 30
is 1 GiB. Subtract PAGE_SHIFT to get the order.
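
The arithmetic can be checked with a small standalone sketch. It assumes 4 KiB base pages (PAGE_SHIFT of 12, matching the preprocessor output quoted later in this thread); EXAMPLE_PAGE_SHIFT, EXAMPLE_ORDER and example_order_by_loop are illustrative names, not kernel symbols.

```c
#include <assert.h>

/* Assumption: 4 KiB base pages, i.e. PAGE_SHIFT == 12. */
#define EXAMPLE_PAGE_SHIFT 12

/* Hypothetical helper: order of a power-of-two size given as log2(bytes). */
#define EXAMPLE_ORDER(size_shift) ((size_shift) - EXAMPLE_PAGE_SHIFT)

/* Slow reference: smallest order such that (PAGE_SIZE << order) >= size. */
static int example_order_by_loop(unsigned long long size)
{
	int order = 0;

	while ((1ULL << (EXAMPLE_PAGE_SHIFT + order)) < size)
		order++;
	return order;
}

/* The shifts really are 16 GiB and 1 GiB. */
static_assert((1ULL << 34) == 16ULL * 1024 * 1024 * 1024, "1 << 34 is 16 GiB");
static_assert((1ULL << 30) == 1ULL * 1024 * 1024 * 1024, "1 << 30 is 1 GiB");

/* With 4 KiB pages the resulting orders are 22 and 18. */
static_assert(EXAMPLE_ORDER(34) == 22, "16 GiB is order 22 with 4 KiB pages");
static_assert(EXAMPLE_ORDER(30) == 18, "1 GiB is order 18 with 4 KiB pages");
```

With a different PAGE_SHIFT the orders change, but the subtraction still matches the loop-based reference.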

The change allows the value to be used to define NR_VMEMMAP_TAILS, which
is used to specify the size of the vmemmap_tails array.

get_order(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G) should be evaluated to a
constant by the compiler.

See __builtin_constant_p handling in get_order().

If that is not working then we have to figure out why.

asm-offsets.s compilation fails:

../include/linux/mmzone.h:1574:16: error: fields must have a constant size:
'variable length array in structure' extension will never be supported
1574 | unsigned long vmemmap_tails[NR_VMEMMAP_TAILS];

Here's what the preprocessor dump of vmemmap_tails looks like:

unsigned long vmemmap_tails[(get_order(1 ? (0x400000000ULL) : 0x40000000) - (( __builtin_constant_p(2 * ((1UL) << 12) / sizeof(struct page)) ? ((2 * ((1UL) << 12) / sizeof(struct page)) < 2 ? 0 : 63 - __builtin_clzll(2 * ((1UL) << 12) / sizeof(struct page))) : (sizeof(2 * ((1UL) << 12) / sizeof(struct page)) <= 4) ? __ilog2_u32(2 * ((1UL) << 12) / sizeof(struct page)) : __ilog2_u64(2 * ((1UL) << 12) / sizeof(struct page)) )) + 1)];

And here's get_order():

static inline __attribute__((__gnu_inline__)) __attribute__((__unused__)) __attribute__((no_instrument_function)) __attribute__((__always_inline__)) __attribute__((__const__)) int get_order(unsigned long size)
{
	if (__builtin_constant_p(size)) {
		if (!size)
			return 64 - 12;

		if (size < (1UL << 12))
			return 0;

		return ( __builtin_constant_p((size) - 1) ? (((size) - 1) < 2 ? 0 : 63 - __builtin_clzll((size) - 1)) : (sizeof((size) - 1) <= 4) ? __ilog2_u32((size) - 1) : __ilog2_u64((size) - 1) ) - 12 + 1;
	}

	size--;
	size >>= 12;

	return fls64(size);
}

I am not sure why it is not a compile-time constant. I have not dug
deeper.
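
One relevant language rule, offered as a possible explanation rather than a confirmed diagnosis: in standard C the size of an array member must be an integer constant expression, and a function call never qualifies as one, even if the function is always_inline and the optimizer folds the call. The sketch below contrasts the two forms. It assumes 4 KiB pages, and EX_PAGE_SHIFT, EX_MAX_FOLIO_ORDER, EX_NR_TAILS, ex_get_order and struct ex_pgdat are hypothetical names. The real NR_VMEMMAP_TAILS also subtracts an ilog2 term, which is omitted here.

```c
#include <stddef.h>

/* Assumption: 4 KiB base pages. */
#define EX_PAGE_SHIFT 12

/* Built only from literals and operators: an integer constant expression. */
#define EX_MAX_FOLIO_ORDER (34 - EX_PAGE_SHIFT)
#define EX_NR_TAILS (EX_MAX_FOLIO_ORDER + 1)

/* Same value computed by a function: usable at run time, but a call to it
 * is not an integer constant expression in C. */
static inline int ex_get_order(unsigned long long size)
{
	int order = 0;

	while ((1ULL << (EX_PAGE_SHIFT + order)) < size)
		order++;
	return order;
}

/* Accepted: the array size is an integer constant expression. */
struct ex_pgdat {
	unsigned long tails[EX_NR_TAILS];
};

/*
 * Rejected by the compiler, because the size contains a function call:
 *
 * struct ex_pgdat_bad {
 *	unsigned long tails[ex_get_order(1ULL << 34) + 1];
 * };
 */

/* Number of elements in the accepted array. */
static size_t ex_nr_tails(void)
{
	return sizeof(((struct ex_pgdat *)0)->tails) / sizeof(unsigned long);
}
```

Both forms produce the same number; only the macro form can size a structure member.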

Very weird. Almost sounds like a bug given that get_order() ends up using ilog2.

But it gets even weirder:

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6f959d8ca4b42..a54445682ccc4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2281,6 +2281,9 @@ static inline unsigned long folio_nr_pages(const struct folio *folio)
* no folios larger than 16 GiB on 64bit and 1 GiB on 32bit.
*/
#define MAX_FOLIO_ORDER get_order(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G)
+
+static_assert(__builtin_constant_p(MAX_FOLIO_ORDER));
+
#else
/*
* Without hugetlb, gigantic folios that are bigger than a single PUD are

gives me


./include/linux/build_bug.h:78:41: error: static assertion failed: "__builtin_constant_p(MAX_FOLIO_ORDER)"
78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
| ^~~~~~~~~~~~~~
./include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
| ^~~~~~~~~~~~~~~
./include/linux/mm.h:2285:1: note: in expansion of macro 'static_assert'
2285 | static_assert(__builtin_constant_p(MAX_FOLIO_ORDER));
| ^~~~~~~~~~~~~

And reversing the condition fixes it.

... so it is a constant? Huh?

I've been staring at the computer for too long; static_assert() does not have BUILD_BUG semantics. So we don't get a constant.

For some reason :)

Even when I just use get_order(4096).
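
The difference between the two assertion styles can be sketched as follows. static_assert(expr) breaks the build when expr is false, while a BUILD_BUG_ON-style check breaks it when the condition is true, so a failing static_assert(__builtin_constant_p(x)) means the builtin returned 0. EX_BUILD_BUG_ON and ex_order are hypothetical stand-ins, not the kernel's definitions, and the file-scope result for the function call mirrors what the experiment above reported; it may depend on the compiler.

```c
#include <assert.h>

/* Hypothetical stand-in for a BUILD_BUG_ON-style macro:
 * breaks the build when cond is TRUE. */
#define EX_BUILD_BUG_ON(cond) static_assert(!(cond), "EX_BUILD_BUG_ON: " #cond)

/* Hypothetical order helper, assuming 4 KiB pages. */
static inline int ex_order(unsigned long size)
{
	int order = 0;

	while ((4096UL << order) < size)
		order++;
	return order;
}

/* static_assert passes when the expression is true. */
static_assert(__builtin_constant_p(4096), "a literal is a constant");

/* At file scope the builtin reports 0 for a function call, so only the
 * negated form passes. This matches "reversing the condition fixes it". */
static_assert(!__builtin_constant_p(ex_order(4096)), "call is not reported constant");

/* BUILD_BUG_ON-style: passes because the condition is false. */
EX_BUILD_BUG_ON(sizeof(unsigned long) < 4);
```

The helper still returns the expected values at run time; it is only the file-scope constant check that reports 0.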

--
Cheers

David