[RFC PATCH 0/8] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory

From: Barry Song (Xiaomi)

Date: Tue Apr 07 2026 - 22:51:35 EST


This patchset accelerates ioremap, vmalloc, and vmap when the memory
is physically fully or partially contiguous. Two techniques are used:

1. Avoid page table zigzag when setting PTEs/PMDs for multiple memory
segments
2. Use batched mappings wherever possible in both vmalloc and ARM64
layers

Patches 1–2 extend ARM64 vmalloc CONT-PTE mapping to support multiple
CONT-PTE regions instead of just one.

Patches 3–4 extend vmap_small_pages_range_noflush() to support page
shifts other than PAGE_SHIFT. This allows mapping multiple memory
segments for vmalloc() without zigzagging page tables.

Patches 5–8 add huge vmap support for contiguous pages. This not only
improves performance but also enables PMD or CONT-PTE mapping for the
vmapped area, reducing TLB pressure.

Many thanks to Xueyuan Chen for his substantial testing efforts
on RK3588 boards.

On the RK3588 8-core ARM64 SoC, with tasks pinned to CPU2 and
the performance CPUfreq policy enabled, Xueyuan’s tests report:

* ioremap(1 MB): 1.2× faster
* vmalloc(1 MB) mapping time (excluding allocation) with
VM_ALLOW_HUGE_VMAP: 1.5× faster
* vmap(): 5.6× faster when memory includes some order-8 pages,
with no regression observed for order-0 pages

Barry Song (Xiaomi) (8):
arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE
setup
arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple
CONT_PTE
mm/vmalloc: Extend vmap_small_pages_range_noflush() to support larger
page_shift sizes
mm/vmalloc: Eliminate page table zigzag for huge vmalloc mappings
mm/vmalloc: map contiguous pages in batches for vmap() if possible
mm/vmalloc: align vm_area so vmap() can batch mappings
mm/vmalloc: Coalesce same page_shift mappings in vmap to avoid pgtable
zigzag
mm/vmalloc: Stop scanning for compound pages after encountering small
pages in vmap

arch/arm64/include/asm/vmalloc.h | 6 +-
arch/arm64/mm/hugetlbpage.c | 10 ++
mm/vmalloc.c | 178 +++++++++++++++++++++++++------
3 files changed, 161 insertions(+), 33 deletions(-)

--
2.39.3 (Apple Git-146)