Re: [REGRESSION] x86/hugetlb: AMD F15h VA alignment offset breaks MAP_HUGETLB alignment

From: Oscar Salvador (SUSE)

Date: Wed May 27 2026 - 11:56:46 EST


On Wed, May 27, 2026 at 04:36:43PM +0200, Karsten Desler wrote:
> Hi,
>
> I found a reproducible hugetlb regression on an AMD Family 15h system.
>
> On some boots, mmap(MAP_HUGETLB) returns a virtual address that is not aligned
> to the hugepage size. The mapping is nevertheless installed as a hugetlb VMA.
> When the process exits, the kernel later BUGs in __unmap_hugepage_range().
>
> 6.18.33 x86_64, AMD opteron 6238, 2M hugepages

Thanks, Karsten, for reporting this.

Oops, that would be me.

> Example bad mapping captured from /proc/$pid/maps:
>
> 7fc67f604000-7fc67f804000 rw-p 00000000 00:0f 12340 /anon_hugepage (deleted)
>
> The address has offset 0x4000 within a 2 MiB hugepage.
>
> smaps confirms it is really hugetlb:
>
> KernelPageSize: 2048 kB
> MMUPageSize: 2048 kB
> Private_Hugetlb: 2048 kB
> VmFlags: rd wr mr mw me de ht
>
> Minimal reproducer:
>
> echo 1000 > /proc/sys/vm/nr_hugepages
>
> mmap(NULL, 1229824, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS|MAP_POPULATE|MAP_HUGETLB, -1, 0)
>
> On bad boots this returns e.g.:
>
> mmap returned 0x7fc67f604000 aligned=no offset=16384
>
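
For anyone else who wants to check their box, the reproducer above can be
sketched as a small C program. This is only a sketch: the function names
(hugepage_offset, try_hugetlb_mmap) and the hard-coded 2 MiB hugepage size
are assumptions made for illustration, and the length is the one from the
report. Note that on an affected kernel the BUG fires when the process
exits, so only run this on a machine you can afford to reboot.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>

/* Assumption: 2 MiB hugepages, as in the report. */
#define HPAGE_SIZE (2UL << 20)

/* Offset of addr within a hugepage of the given (power-of-two) size. */
unsigned long hugepage_offset(unsigned long addr, unsigned long hpage_size)
{
    assert(hpage_size && (hpage_size & (hpage_size - 1)) == 0);
    return addr & (hpage_size - 1);
}

/*
 * Issue the mmap() call from the report and print whether the result is
 * hugepage-aligned. Returns 0 if aligned, 1 if misaligned, -1 on failure
 * (e.g. no hugepages reserved via /proc/sys/vm/nr_hugepages).
 */
int try_hugetlb_mmap(void)
{
    size_t len = 1229824; /* length used in the report */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE | MAP_HUGETLB,
                   -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    unsigned long off = hugepage_offset((unsigned long)p, HPAGE_SIZE);
    printf("mmap returned %p aligned=%s offset=%lu\n",
           p, off ? "no" : "yes", off);
    return off ? 1 : 0;
}
```

Calling try_hugetlb_mmap() from main() after reserving hugepages should
print a line in the same format as the one quoted above. The mapping is
deliberately left in place so that process exit tears it down.
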
> and exiting the process triggers:
>
> Kernel BUG at __unmap_hugepage_range+0x5ef/0x640
> RIP: __unmap_hugepage_range+0x5ef/0x640
> Fixing recursive fault but reboot is needed!
>
> The following is AI work, sorry if that's total BS but at the very least,
> I can reproduce the kernelBUG and booting with
> align_va_addr=off
> works around the issue.
>
> This is boot-dependent. Some boots work, some fail. The reason appears
> to be the per-boot AMD F15h VA alignment offset.

I have to confess that I completely overlooked that scenario, so let me
apologize.

> The old x86 hugetlb path in arch/x86/mm/hugetlbpage.c only set:
>
> info.align_mask = PAGE_MASK & ~huge_page_mask(h);
>
> It did not add the AMD F15h align offset.
>
> After the v6.13-rc1 hugetlb mmap rework, hugetlb mappings go through
> arch_get_unmapped_area*(), and x86 currently does:
>
> if (filp) {
> info.align_mask = get_align_mask(filp);
> info.align_offset += get_align_bits();
> }

Ok, I see.

>
> For hugetlb, get_align_mask(filp) correctly returns the hugepage alignment
> mask, but get_align_bits() can still return the AMD F15h per-boot offset,
> e.g. 0x4000. That produces a non-hugepage-aligned hugetlb VMA.
>
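
To make the arithmetic concrete, here is a rough userspace model of how a
hugepage align_mask combined with a non-zero align_offset yields a
misaligned address. It is a sketch under assumptions, not kernel code: it
assumes the top-down search adjusts a candidate with
"gap -= (gap - align_offset) & align_mask", the function names are made up
for illustration, and the starting address 0x7fc67f800000 is a hypothetical
candidate chosen so that the result can be compared with the report.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: 4 KiB base pages, so PAGE_MASK clears the low 12 bits. */
#define MODEL_PAGE_MASK (~0xfffUL)

/*
 * Models "PAGE_MASK & ~huge_page_mask(h)", taking huge_page_mask(h) to be
 * ~(hpage_size - 1). For 2 MiB hugepages this gives 0x1ff000.
 */
unsigned long hugetlb_align_mask(unsigned long hpage_size)
{
    assert(hpage_size && (hpage_size & (hpage_size - 1)) == 0);
    return MODEL_PAGE_MASK & (hpage_size - 1);
}

/*
 * Simplified top-down adjustment (assumed form): move the candidate down
 * until its bits under align_mask match those of align_offset.
 */
unsigned long align_topdown(unsigned long gap, unsigned long align_mask,
                            unsigned long align_offset)
{
    return gap - ((gap - align_offset) & align_mask);
}

/* Print the adjusted address for a given per-boot offset. */
void show_alignment(unsigned long gap, unsigned long align_offset)
{
    unsigned long mask = hugetlb_align_mask(2UL << 20);
    unsigned long addr = align_topdown(gap, mask, align_offset);

    printf("align_offset=%#lx -> addr=%#lx (offset in hugepage: %#lx)\n",
           align_offset, addr, addr & ((2UL << 20) - 1));
}
```

In this model, show_alignment(0x7fc67f800000, 0x4000) lands on
0x7fc67f604000, i.e. 0x4000 into a 2 MiB hugepage, while an align_offset of
0 keeps the address hugepage-aligned.
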
> Likely introduced by the v6.13-rc1 series:
>
> 1317a5e7f7b1 arch/x86: teach arch_get_unmapped_area_vmflags to handle hugetlb mappings
> 7bd3f1e1a9ae mm: make hugetlb mappings go through mm_get_unmapped_area_vmflags
> cc92882ee218 mm: drop hugetlb_get_unmapped_area{_*} functions

Yes, that was part of a refactoring I did some time ago.

I will fix it up later today/early tomorrow.

Would you be available for a quick test once I have the patch?



--
Oscar Salvador
SUSE Labs