mm/hugetlb: kernel fails to boot if total hugepages size is almost equal to system RAM

From: Sourabh Jain

Date: Thu Dec 18 2025 - 11:20:26 EST


Hello All,

I observed a kernel boot failure when the total size of the requested
hugepages is almost equal to the system RAM.

For example, a Power system with 255 GB of RAM failed to boot with the
following kernel command-line arguments:

default_hugepagesz=2M hugepagesz=2M hugepages=128512
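
To put the request in perspective, the arithmetic can be sketched as
below. This is only an illustration: it treats the "255 GB" above as
255 GiB, which is an assumption, and the helper name is made up for
this example.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

def hugepage_request_bytes(nr_pages: int, page_size_mib: int) -> int:
    """Total memory requested by hugepages=<nr_pages> at the given page size."""
    return nr_pages * page_size_mib * MIB

# Values from the command line above: hugepagesz=2M hugepages=128512
requested = hugepage_request_bytes(128512, 2)

# Assumption: "255 GB" is read as 255 GiB for this estimate.
total_ram = 255 * GIB
headroom = total_ram - requested

print(f"requested: {requested // GIB} GiB")   # 251 GiB
print(f"headroom:  {headroom // GIB} GiB")    # 4 GiB
```

Under that assumption the request covers 251 GiB and leaves about
4 GiB for everything else, which is the same size as the
crashkernel=4G reservation on this system.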

The failure occurred with the following logs:

  Booting a command list

OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 6.19.0-rc1+ (root@root) (gcc (GCC), GNU ld version 2.35.2-63.el9) #4 SMP Thu Dec 18 09:02:16 CST 2025
Detected machine type: 0000000000000101
command line: BOOT_IMAGE=(ieee1275//vdevice/v-scsi@30000065/disk@8100000000000000,msdos2)/vmlinuz-6.19.0-rc1+ root=/dev/mapper/r-root ro rd.lvm.lv=root/root rd.lvm.lv=root/swap biosdevname=0 loglevel=7 ignore_loglevel debug console=hvc0 earlycon=hvc0 earlyprintk crashkernel=4G default_hugepagesz=2M hugepagesz=2M hugepages=128512
Max number of cores passed to firmware: 256 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000016050000
  alloc_top    : 0000000030000000
  alloc_top_hi : 0000000030000000
  rmo_top      : 0000000030000000
  ram_top      : 0000000030000000
instantiating rtas at 0x000000002ec50000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000016060000 -> 0x0000000016061844
Device tree struct  0x0000000016070000 -> 0x0000000016080000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x000000000a700000 ...
[    0.000000] printk: debug: ignoring loglevel setting.
[    0.000000] crashkernel reserved: 0x0000000018000000 - 0x0000000118000000 (4096 MB)
[    0.000000] radix-mmu: Page sizes from device-tree:
[    0.000000] radix-mmu: Page size shift = 12 AP=0x0
[    0.000000] radix-mmu: Page size shift = 16 AP=0x5
[    0.000000] radix-mmu: Page size shift = 21 AP=0x1
[    0.000000] radix-mmu: Page size shift = 30 AP=0x2
[    0.000000] Activating Kernel Userspace Access Prevention
[    0.000000] Activating Kernel Userspace Execution Prevention
[    0.000000] radix-mmu: Mapped 0x0000000000000000-0x0000000002800000 with 2.00 MiB pages (exec)
[    0.000000] radix-mmu: Mapped 0x0000000002800000-0x0000003ffde00000 with 2.00 MiB pages
[    0.000000] radix-mmu: Mapped 0x0000003ffde00000-0x0000003ffdff0000 with 64.0 KiB pages
[    0.000000] radix-mmu: Mapped 0x0000003fffff0000-0x0000004000000000 with 64.0 KiB pages
[    0.000000] radix-mmu: Mapped 0x0000003ffdff0000-0x0000003fffff0000 with 64.0 KiB pages
[    0.000000] lpar: Using radix MMU under hypervisor
[    0.000000] Linux version 6.19.0-rc1+ (root) (gcc (GCC) GNU ld version 2.35.2-63.el9) #4 SMP Thu Dec 18 09:02:16 CST 2025
[    0.000000] OF: reserved mem: Reserved memory: No reserved-memory node in the DT
[    0.000000] Found initrd at 0xc00000000f800000:0xc000000016046afe
[    0.000000] Hardware name: hv:phyp pSeries
[    0.000000] printk: legacy bootconsole [udbg0] enabled
[    0.000000] Partition configured for 72 cpus.
[    0.000000] CPU maps initialized for 8 threads per core
[    0.000000]  (thread shift is 3)

<snip>

[    0.000000] Initmem setup node 28 as memoryless
[    0.000000] Initmem setup node 29 as memoryless
[    0.000000] Initmem setup node 30 as memoryless
[    0.000000] Initmem setup node 31 as memoryless
[    0.000000] percpu: Embedded 3 pages/cpu s126488 r0 d70120 u196608
[    0.000000] pcpu-alloc: s126488 r0 d70120 u196608 alloc=3*65536
[    0.000000] pcpu-alloc: [0] 00 [0] 01 [0] 02 [0] 03 [0] 04 [0] 05 [0] 06 [0] 07
[    0.000000] pcpu-alloc: [0] 08 [0] 09 [0] 10 [0] 11 [0] 12 [0] 13 [0] 14 [0] 15
[    0.000000] pcpu-alloc: [0] 16 [0] 17 [0] 18 [0] 19 [0] 20 [0] 21 [0] 22 [0] 23
[    0.000000] pcpu-alloc: [0] 24 [0] 25 [0] 26 [0] 27 [0] 28 [0] 29 [0] 30 [0] 31
[    0.000000] pcpu-alloc: [1] 32 [1] 33 [1] 34 [1] 35 [1] 36 [1] 37 [1] 38 [1] 39
[    0.000000] pcpu-alloc: [1] 40 [1] 41 [1] 42 [1] 43 [1] 44 [1] 45 [1] 46 [1] 47
[    0.000000] pcpu-alloc: [1] 48 [1] 49 [1] 50 [1] 51 [1] 52 [1] 53 [1] 54 [1] 55
[    0.000000] pcpu-alloc: [1] 56 [1] 57 [1] 58 [1] 59 [1] 60 [1] 61 [1] 62 [1] 63
[    0.000000] pcpu-alloc: [2] 64 [2] 65 [2] 66 [2] 67 [2] 68 [2] 69 [2] 70 [2] 71
[    0.000000] Kernel command line: BOOT_IMAGE=(ieee1275//vdevice/v-scsi@30000065/disk@8100000000000000,msdos2)/vmlinuz-6.19.0-rc1+ root=/dev/mapper/root ro rd.lvm.lv=root/root rd.lvm.lv=root/swap biosdevname=0 loglevel=7 ignore_loglevel debug console=hvc0 earlycon=hvc0 earlyprintk crashkernel=4G default_hugepagesz=2M hugepagesz=2M hugepages=128512
[    0.000000] Unknown kernel command line parameters "earlyprintk biosdevname=0", will be passed to user space.
[    0.000000] random: crng init done
[    0.000000] printk: log buffer data + meta data: 1048576 + 3670016 = 4718592 bytes

<snip>

[    0.070655] thermal_sys: Registered thermal governor 'step_wise'
[    0.070709] cpuidle: using governor menu
[    0.070781] RTAS daemon started
[    0.070984] pstore: Using crash dump compression: deflate
[    0.070988] pstore: Registered nvram as persistent store backend
[    0.071386] EEH: pSeries platform initialized
[    0.071459] plpks: POWER LPAR Platform KeyStore is not supported or enabled
[    0.081865] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[    2.828787] HugeTLB: allocation took 2740ms with hugepage_allocation_threads=18
[    2.828821] HugeTLB: allocating 128512 of page size 2.00 MiB failed.  Only allocated 128429 hugepages.
[    2.828852] HugeTLB: registered 2.00 MiB page size, pre-allocated 128429 pages
[    2.828855] HugeTLB: 0 KiB vmemmap can be freed for a 2.00 MiB page
[    2.828858] HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
[    2.828862] HugeTLB: 0 KiB vmemmap can be freed for a 1.00 GiB page
[    2.831713] swapper/0: page allocation failure: order:5, mode:0xcc0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=1-3
[    2.831732] CPU: 51 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.19.0-rc1+ #4 VOLUNTARY
[    2.831736] Hardware name: hv:phyp pSeries
[    2.831738] Call Trace:
[    2.831738] [c000001c801b77c0] [c00000000111ae6c] dump_stack_lvl+0x8c/0xf0 (unreliable)
[    2.831747] [c000001c801b77f0] [c00000000059a024] warn_alloc+0x12c/0x1d8
[    2.831752] [c000001c801b7890] [c00000000059a918] __alloc_pages_slowpath.constprop.0+0x848/0xa98
[    2.831755] [c000001c801b79d0] [c00000000059ae3c] __alloc_frozen_pages_noprof+0x2d4/0x3a8
[    2.831758] [c000001c801b7a50] [c0000000005eac64] alloc_pages_mpol+0x10c/0x1f4
[    2.831761] [c000001c801b7ab0] [c0000000005eadac] alloc_pages_noprof+0x60/0xe8
[    2.831763] [c000001c801b7ad0] [c0000000004d9978] mempool_alloc_pages+0x24/0x38
[    2.831767] [c000001c801b7af0] [c0000000004da4a0] mempool_init_node+0x138/0x1fc
[    2.831769] [c000001c801b7b40] [c00000000208844c] bio_integrity_initfn+0x40/0x70
[    2.831773] [c000001c801b7ba0] [c000000000010c44] do_one_initcall+0x60/0x36c
[    2.831776] [c000001c801b7c80] [c000000002006b2c] do_initcalls+0x12c/0x22c
[    2.831779] [c000001c801b7d30] [c000000002006f1c] kernel_init_freeable+0x23c/0x390
[    2.831781] [c000001c801b7de0] [c000000000011078] kernel_init+0x34/0x26c
[    2.831783] [c000001c801b7e50] [c00000000000dd3c] ret_from_kernel_user_thread+0x14/0x1c
[    2.831786] ---- interrupt: 0 at 0x0
[    2.831790] Mem-Info:
[    2.831871] active_anon:0 inactive_anon:0 isolated_anon:0
[    2.831871]  active_file:0 inactive_file:0 isolated_file:0
[    2.831871]  unevictable:0 dirty:0 writeback:0
[    2.831871]  slab_reclaimable:82 slab_unreclaimable:2106
[    2.831871]  mapped:0 shmem:0 pagetables:146
[    2.831871]  sec_pagetables:0 bounce:0
[    2.831871]  kernel_misc_reclaimable:0
[    2.831871]  free:944 free_pcp:3099 free_cma:0
[    2.831903] Node 1 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB kernel_stack:8000kB pagetables:4224kB sec_pagetables:0kB all_unreclaimable? no Balloon:0kB
[    2.831925] Node 2 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB kernel_stack:7968kB pagetables:4096kB sec_pagetables:0kB all_unreclaimable? no Balloon:0kB
[    2.831937] Node 3 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB kernel_stack:2272kB pagetables:1024kB sec_pagetables:0kB all_unreclaimable? no Balloon:0kB
[    2.831962] Node 1 Normal free:19520kB boost:0kB min:29440kB low:144448kB high:259456kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB zspages:0kB present:119537664kB managed:115056960kB mlocked:0kB bounce:0kB free_pcp:84992kB local_pcp:2048kB free_cma:0kB
[    2.831991] lowmem_reserve[]: 0 0 0
[    2.831997] Node 2 Normal free:39424kB boost:2048kB min:32512kB low:151360kB high:270208kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB zspages:0kB present:119013376kB managed:118885632kB mlocked:0kB bounce:0kB free_pcp:95552kB local_pcp:2816kB free_cma:0kB
[    2.832008] lowmem_reserve[]: 0 0 0
[    2.832011] Node 3 Normal free:1472kB boost:0kB min:7616kB low:37376kB high:67136kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB zspages:0kB present:29884416kB managed:29784448kB mlocked:0kB bounce:0kB free_pcp:17792kB local_pcp:0kB free_cma:0kB
[    2.832021] lowmem_reserve[]: 0 0 0
[    2.832025] Node 1 Normal: 3*64kB (UME) 3*128kB (ME) 4*256kB (UME) 3*512kB (UME) 4*1024kB (ME) 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 7232kB
[    2.832037] Node 2 Normal: 1*64kB (U) 0*128kB 1*256kB (M) 0*512kB 2*1024kB (UM) 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 2368kB
[    2.832052] Node 3 Normal: 1*64kB (E) 1*128kB (M) 3*256kB (UME) 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 1472kB
[    2.832068] Node 1 hugepages_total=56043 hugepages_free=56043 hugepages_surp=0 hugepages_size=2048kB
[    2.832078] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[    2.832086] Node 2 hugepages_total=57915 hugepages_free=57915 hugepages_surp=0 hugepages_size=2048kB
[    2.832093] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[    2.832102] Node 3 hugepages_total=14471 hugepages_free=14471 hugepages_surp=0 hugepages_size=2048kB
[    2.832111] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[    2.832119] 0 total pagecache pages
[    2.832122] 0 pages in swap cache
[    2.832127] Free swap  = 0kB
[    2.832130] Total swap = 0kB
[    2.832133] 4194304 pages RAM
[    2.832138] 0 pages HighMem/MovableOnly
[    2.832141] 73569 pages reserved
[    2.832143] 0 pages cma reserved
[    2.832146] 0 pages hwpoisoned
[    2.832153] Memory cgroup min protection 0kB -- low protection 0kB
[    2.832154] Kernel panic - not syncing: bio: can't create integrity buf pool
[    2.832160] CPU: 51 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.19.0-rc1+ #4 VOLUNTARY
[    2.832164] Hardware name: hv:phyp pSeries
[    2.832167] Call Trace:
[    2.832169] [c000001c801b7a50] [c00000000111aeb8] dump_stack_lvl+0xd8/0xf0 (unreliable)
[    2.832180] [c000001c801b7a80] [c00000000015d79c] vpanic+0x2c8/0x4b4
[    2.832189] [c000001c801b7b20] [c00000000015d9c8] nmi_panic+0x0/0xa0
[    2.832197] [c000001c801b7b40] [c000000002088478] bio_integrity_initfn+0x6c/0x70
[    2.832205] [c000001c801b7ba0] [c000000000010c44] do_one_initcall+0x60/0x36c
[    2.832213] [c000001c801b7c80] [c000000002006b2c] do_initcalls+0x12c/0x22c
[    2.832221] [c000001c801b7d30] [c000000002006f1c] kernel_init_freeable+0x23c/0x390
[    2.832229] [c000001c801b7de0] [c000000000011078] kernel_init+0x34/0x26c
[    2.832237] [c000001c801b7e50] [c00000000000dd3c] ret_from_kernel_user_thread+0x14/0x1c
[    2.832247] ---- interrupt: 0 at 0x0
[    2.834181] pstore: backend (nvram) writing error (-1)
[    2.835809] Rebooting in 10 seconds..
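
The figures in the log above can be cross-checked with a short script.
This is a rough sketch, not a diagnosis: the values are copied by hand
from the Mem-Info dump, the 64 KiB base page size is inferred from the
smallest buddy block listed there, and the variable names are only for
illustration.

```python
# Assumption: 64 KiB base pages, inferred from the "64kB" order-0
# entries in the buddy free lists of the log.
BASE_PAGE_KIB = 64

# "Node N hugepages_total=..." for the 2048kB size, from the log.
per_node_hugepages = {1: 56043, 2: 57915, 3: 14471}

# Largest non-empty buddy block per node, read from the
# "Node N Normal: ..." free-list lines in the log.
largest_free_block_kib = {1: 1024, 2: 1024, 3: 512}

def order_size_kib(order: int, base_page_kib: int = BASE_PAGE_KIB) -> int:
    """Size of one contiguous allocation of the given order."""
    return (1 << order) * base_page_kib

total_hugepages = sum(per_node_hugepages.values())
needed_kib = order_size_kib(5)  # the failed request was order:5
nodes_that_can_satisfy = [
    node for node, kib in largest_free_block_kib.items() if kib >= needed_kib
]

print(total_hugepages)          # 128429, matches "Only allocated 128429 hugepages"
print(needed_kib)               # 2048
print(nodes_that_can_satisfy)   # []
```

With those assumptions, the order:5 request needs a contiguous 2 MiB
block, and none of the three nodes has a free block that large left
after the hugepage pre-allocation.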

I agree that reserving almost all of the system RAM for hugepages is
not very practical. However, would it be a good idea to make the
hugepage allocator aware of the total system memory, so that it leaves
some memory for the kernel to boot?
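
To make the idea concrete, here is a minimal sketch of such a cap. It
is not kernel code and not a proposed patch: the function name, the
8 GiB reserve, and reading "255 GB" as 255 GiB are all assumptions
picked only for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

def cap_hugepages(requested_pages: int, hugepage_bytes: int,
                  total_ram_bytes: int, reserve_bytes: int) -> int:
    """Clamp a hugepage request so that reserve_bytes of RAM stay free.

    Hypothetical helper: the real allocator would also have to account
    for per-node memory, memblock reservations and so on.
    """
    usable = max(total_ram_bytes - reserve_bytes, 0)
    return min(requested_pages, usable // hugepage_bytes)

# hugepages=128512 with 2 MiB pages on a machine assumed to have 255 GiB.
# The 8 GiB reserve is an arbitrary illustrative value.
capped = cap_hugepages(128512, 2 * MIB, 255 * GIB, 8 * GIB)
print(capped)  # 126464

# A request that already fits is left untouched.
print(cap_hugepages(1024, 2 * MIB, 255 * GIB, 8 * GIB))  # 1024
```

How large the reserve should be, and whether it should be fixed or
scaled with RAM size, is exactly the open question.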

Thanks,
Sourabh Jain