Re: [PATCHv2 06/14] mm: Rework compound_head() for power-of-2 sizeof(struct page)

From: Kiryl Shutsemau

Date: Mon Dec 22 2025 - 09:49:49 EST


On Mon, Dec 22, 2025 at 05:45:16PM +0800, Muchun Song wrote:
>
>
> > On Dec 22, 2025, at 15:57, Muchun Song <muchun.song@xxxxxxxxx> wrote:
> >
> >
> >
> >> On Dec 18, 2025, at 23:09, Kiryl Shutsemau <kas@xxxxxxxxxx> wrote:
> >>
> >> For tail pages, the kernel uses the 'compound_info' field to get to the
> >> head page. Bit 0 of the field indicates whether the page is a
> >> tail page, and if set, the remaining bits represent a pointer to the
> >> head page.
> >>
> >> For cases when the size of struct page is a power of 2, change the encoding
> >> of compound_info to store a mask that can be applied to the virtual address
> >> of the tail page in order to access the head page. This is possible
> >> because the struct page of the head page is naturally aligned with regard
> >> to the order of the page.
> >>
> >> The significant impact of this modification is that all tail pages of
> >> the same order will now have identical 'compound_info', regardless of
> >> the compound page they are associated with. This paves the way for
> >> eliminating fake heads.
> >>
> >> The HugeTLB Vmemmap Optimization (HVO) creates fake heads and is only
> >> applied when sizeof(struct page) is a power of 2. Having identical
> >> tail pages allows the same page to be mapped into the vmemmap of all
> >> pages, maintaining memory savings without fake heads.
> >>
> >> If sizeof(struct page) is not a power of 2, there are no functional
> >> changes.
> >>
> >
> > Forgot to mention, I believe I stated in the previous version that this
> > mechanism only applies when CONFIG_SPARSEMEM_VMEMMAP is configured.
> > Therefore, you need to wrap the entire mechanism within CONFIG_SPARSEMEM_VMEMMAP.
> > For other configurations, it's difficult to guarantee alignment to a very
> > large size (for example, in the case of CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP,
> > vmemmap allocation uses kvmalloc, which only guarantees PAGE_SIZE alignment
> > for the returned address).
>
> I found that we can call kvmalloc_node_align inside populate_section_memmap (for
> the memory hotplug case), so that we can specify the alignment parameter as the
> input size. Then, this mechanism can be applied for CONFIG_SPARSEMEM &&
> !CONFIG_SPARSEMEM_VMEMMAP.
>
> For CONFIG_FLATMEM, we also need a similar approach to specify the correct
> alignment in alloc_node_mem_map().

I guess I will need to invest some time to make a test setup with
!VMEMMAP and FLATMEM.
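
To make the alignment dependency concrete, here is a rough userspace
sketch of the mask idea. It is not the code from the patch: the struct
layout, tail_info(), sketch_compound_head() and alloc_aligned_memmap()
are all made-up names for illustration, and the exact bit layout of
compound_info is an assumption. The only points it tries to show are
that every tail of a given order carries the same value, and that the
lookup only works if the memmap is aligned to
sizeof(struct page) << order.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for struct page (hypothetical layout, power-of-2 size). */
struct page {
	unsigned long compound_info;
	unsigned long pad[7];
};

_Static_assert((sizeof(struct page) & (sizeof(struct page) - 1)) == 0,
	       "sketch assumes power-of-2 sizeof(struct page)");

/*
 * Assumed encoding: an alignment mask covering all struct pages of one
 * compound page of the given order, with bit 0 set as the tail marker.
 * The value depends only on the order, not on the compound page.
 */
static unsigned long tail_info(unsigned int order)
{
	unsigned long span = sizeof(struct page) << order;

	return ~(span - 1) | 1UL;
}

static struct page *sketch_compound_head(struct page *page)
{
	unsigned long info = page->compound_info;

	if (!(info & 1UL))
		return page;	/* not a tail page */

	/*
	 * A struct page address always has bit 0 clear, so the AND both
	 * rounds down to the head and drops the tail marker.
	 */
	return (struct page *)((uintptr_t)page & info);
}

/*
 * Toy memmap for a single compound page. The allocation is aligned to
 * its own size; with only PAGE_SIZE alignment the mask could round down
 * to an address that is not the head.
 */
static struct page *alloc_aligned_memmap(unsigned int order)
{
	size_t nr = (size_t)1 << order;
	size_t span = sizeof(struct page) << order;
	struct page *map = aligned_alloc(span, span);

	if (!map)
		return NULL;

	map[0].compound_info = 0;	/* head: bit 0 clear */
	for (size_t i = 1; i < nr; i++)
		map[i].compound_info = tail_info(order);

	return map;
}
```

With map = alloc_aligned_memmap(3), sketch_compound_head(&map[5])
resolves to &map[0], and map[1] through map[7] all hold the same
compound_info. If the base were aligned to less than the span, the same
AND could land before the real head, which is the case the !VMEMMAP and
FLATMEM allocators have to rule out.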

--
Kiryl Shutsemau / Kirill A. Shutemov