Re: [PATCHv2 06/14] mm: Rework compound_head() for power-of-2 sizeof(struct page)

From: Muchun Song

Date: Mon Dec 22 2025 - 04:46:03 EST




> On Dec 22, 2025, at 15:57, Muchun Song <muchun.song@xxxxxxxxx> wrote:
>
>
>
>> On Dec 18, 2025, at 23:09, Kiryl Shutsemau <kas@xxxxxxxxxx> wrote:
>>
>> For tail pages, the kernel uses the 'compound_info' field to get to the
>> head page. The bit 0 of the field indicates whether the page is a
>> tail page, and if set, the remaining bits represent a pointer to the
>> head page.
>>
>> For cases when size of struct page is power-of-2, change the encoding of
>> compound_info to store a mask that can be applied to the virtual address
>> of the tail page in order to access the head page. It is possible
>> because struct page of the head page is naturally aligned with regards
>> to order of the page.
>>
>> The significant impact of this modification is that all tail pages of
>> the same order will now have identical 'compound_info', regardless of
>> the compound page they are associated with. This paves the way for
>> eliminating fake heads.
>>
>> The HugeTLB Vmemmap Optimization (HVO) creates fake heads and it is only
>> applied when the sizeof(struct page) is power-of-2. Having identical
>> tail pages allows the same page to be mapped into the vmemmap of all
>> pages, maintaining memory savings without fake heads.
>>
>> If sizeof(struct page) is not power-of-2, there are no functional
>> changes.
>>
>
> Forgot to mention, I believe I stated in the previous version that this
> mechanism only applies when CONFIG_SPARSEMEM_VMEMMAP is configured.
> Therefore, you need to wrap the entire mechanism within CONFIG_SPARSEMEM_VMEMMAP.
> For other configurations, it's difficult to guarantee alignment to a very
> large size (for example, in the case of CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP,
> vmemmap allocation uses kvmalloc, which only guarantees PAGE_SIZE alignment
> for the returned address).

I found that we can call kvmalloc_node_align() inside populate_section_memmap()
(for the memory hotplug case) and pass the allocation size as the alignment
parameter. Then this mechanism can also be applied to CONFIG_SPARSEMEM &&
!CONFIG_SPARSEMEM_VMEMMAP.

For CONFIG_FLATMEM, we also need a similar approach to specify the correct
alignment in alloc_node_mem_map().
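
To illustrate why the alignment of the memmap matters, here is a rough
userspace sketch (not the code from the patch). It assumes a 64-byte
stand-in for struct page and derives the mask from the order directly
instead of reading it from compound_info. All names (fake_page,
head_mask, head_of, masks_resolve) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page; only its power-of-2 size matters. */
struct fake_page {
	unsigned long compound_info;
	unsigned char pad[64 - sizeof(unsigned long)];
};

_Static_assert(sizeof(struct fake_page) == 64, "size must be a power of 2");

/* Mask that clears the address bits covering one compound page's struct pages. */
static uintptr_t head_mask(unsigned int order)
{
	return ~(((uintptr_t)sizeof(struct fake_page) << order) - 1);
}

/* Find the head by masking the virtual address of the tail page itself. */
static struct fake_page *head_of(struct fake_page *tail, unsigned int order)
{
	return (struct fake_page *)((uintptr_t)tail & head_mask(order));
}

/*
 * Return 1 if masking resolves every entry of map[] to the first entry of
 * its order-sized group, 0 otherwise. This only holds when the base of
 * map[] is aligned to sizeof(struct fake_page) << order.
 */
static int masks_resolve(struct fake_page *map, size_t nr, unsigned int order)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		struct fake_page *expected = &map[i & ~(((size_t)1 << order) - 1)];

		if (head_of(&map[i], order) != expected)
			return 0;
	}
	return 1;
}
```

With a 64-byte struct, an order-9 compound page needs the memmap base
aligned to 32 KiB. A base that is only aligned to 4 KiB (what a
PAGE_SIZE-aligned allocation gives you) still works for orders up to 6,
but masking lands on the wrong address for anything larger.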

Thanks.