Re: [PATCH v5 0/3] mm/vmalloc: free unused pages on vrealloc() shrink

From: Shivam Kalra

Date: Sat Mar 21 2026 - 04:17:16 EST


On 17/03/26 13:47, Shivam Kalra via B4 Relay wrote:
> This series implements the TODO in vrealloc() to unmap and free unused
> pages when shrinking across a page boundary.
>
> Problem:
> When vrealloc() shrinks an allocation, it updates bookkeeping
> (requested_size, KASAN shadow) but does not free the underlying physical
> pages. This wastes memory for the lifetime of the allocation.
>
> Solution:
> - Patch 1: Extracts a vm_area_free_pages(vm, start, end) helper from
> vfree() that frees a range of pages with memcg and nr_vmalloc_pages
> accounting. Freed page pointers are set to NULL to prevent stale
> references.
> - Patch 2: Uses the helper to free tail pages when vrealloc() shrinks
> across a page boundary. Skips huge page allocations (page_order > 0)
> since compound pages cannot be partially freed. Allocations with
> VM_FLUSH_RESET_PERMS are also skipped. Also fixes the grow-in-place
> path to check vm->nr_pages instead of get_vm_area_size(), which
> reflects the virtual reservation and does not change on shrink.
> - Patch 3: Adds a vrealloc test case to lib/test_vmalloc that exercises
> grow-realloc, shrink-across-boundary, shrink-within-page, and
> grow-in-place paths with data integrity validation.
>
> The virtual address reservation is kept intact to preserve the range
> for potential future grow-in-place support.
> A concrete user is the Rust binder driver's KVVec::shrink_to [1], which
> performs explicit vrealloc() shrinks for memory reclamation.
>
> Tested:
> - KASAN KUnit (vmalloc_oob passes)
> - lib/test_vmalloc stress tests (3/3, 1M iterations each)
> - checkpatch, sparse, W=1, allmodconfig, coccicheck clean
>
> [1] https://lore.kernel.org/all/20260216-binder-shrink-vec-v3-v6-0-ece8e8593e53@xxxxxxxxxxx/
>
> Signed-off-by: Shivam Kalra <shivamkalra98@xxxxxxxxxxx>
> ---
> Changes in v5:
> - Skip vrealloc shrink for VM_FLUSH_RESET_PERMS (Uladzislau Rezki)
> - Link to v4: https://lore.kernel.org/r/20260314-vmalloc-shrink-v4-0-c1e2e0bb5455@xxxxxxxxxxx
>
> Changes in v4:
> - Rename vmalloc_free_pages() to vm_area_free_pages() to align with
> vm_area_alloc_pages() (Uladzislau Rezki)
> - NULL out freed vm->pages[] entries to prevent stale pointers (Alice Ryhl)
> - Remove redundant if (vm->nr_pages) guard in vfree() (Uladzislau Rezki)
> - Add vrealloc test case to lib/test_vmalloc (new patch 3/3)
> - Link to v3: https://lore.kernel.org/r/20260309-vmalloc-shrink-v3-0-5590fd8de2eb@xxxxxxxxxxx
>
> Changes in v3:
> - Restore the comment.
> - Rebase to the latest mm-new
> - Link to v2: https://lore.kernel.org/r/20260304-vmalloc-shrink-v2-0-28c291d60100@xxxxxxxxxxx
>
> Changes in v2:
> - Updated the base-commit to mm-new
> - Fix conflicts after rebase
> - Ran `clang-format` on the changes made
> - Use a single `kasan_vrealloc` (Alice Ryhl)
> - Link to v1: https://lore.kernel.org/r/20260302-vmalloc-shrink-v1-0-46deff465b7e@xxxxxxxxxxx
>
> ---
> Shivam Kalra (3):
> mm/vmalloc: extract vm_area_free_pages() helper from vfree()
> mm/vmalloc: free unused pages on vrealloc() shrink
> lib/test_vmalloc: add vrealloc test case
>
> lib/test_vmalloc.c | 52 ++++++++++++++++++++++++++++++++++++++++++
> mm/vmalloc.c | 67 ++++++++++++++++++++++++++++++++++++++----------------
> 2 files changed, 100 insertions(+), 19 deletions(-)
> ---
> base-commit: 7d47a508dfdc335c107fb00b4d9ef46488281a52
> change-id: 20260302-vmalloc-shrink-04b2fa688a14
>
> Best regards,

Hi everyone,

Following up on the concerns raised regarding `get_vm_area_size()` versus
`vm->nr_pages << PAGE_SHIFT`: Andrew kindly ran the patchset through an
AI review, which flagged several concrete issues.

I've used those results to audit the code and figure out exactly what
breaks when we shrink allocations while preserving the virtual area size.
Based on that research, here is what I am planning to include in the v6
series to address these edge cases:

1. Fixing the VM_USERMAP crash
Alice correctly pointed out that `remap_vmalloc_range_partial()` relies
on `get_vm_area_size()` to validate the mapping size. If we free tail
pages but keep `vm->size` unchanged, a request to map the full original
size still passes that check and then hits a NULL pointer dereference
in `vm_insert_page()` for the freed tail.
Plan: I'll update the shrink path to explicitly bail out if `VM_USERMAP`
is set, so these areas always keep their full set of pages.
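
To make the bail-out conditions concrete, here is a rough userspace
sketch of the eligibility check. It is not the actual patch: the struct,
the flag values, and the name `model_can_free_tail()` are made up for
illustration, and only the conditions themselves (huge pages,
`VM_FLUSH_RESET_PERMS`, `VM_USERMAP`) come from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits for this model only; the kernel defines the real ones. */
#define MODEL_VM_USERMAP           (1UL << 0)
#define MODEL_VM_FLUSH_RESET_PERMS (1UL << 1)

/* Hypothetical, trimmed-down mirror of the fields the check looks at. */
struct model_vm {
	unsigned long flags;
	unsigned int page_order;
	unsigned int nr_pages;
};

/* Returns true only when the tail pages may be freed on shrink. */
static bool model_can_free_tail(const struct model_vm *vm,
				unsigned int new_nr_pages)
{
	if (new_nr_pages >= vm->nr_pages)
		return false;	/* shrink does not cross a page boundary */
	if (vm->page_order > 0)
		return false;	/* huge pages cannot be partially freed */
	if (vm->flags & (MODEL_VM_USERMAP | MODEL_VM_FLUSH_RESET_PERMS))
		return false;	/* other code may still rely on the full size */
	return true;
}
```

With this shape, a `VM_USERMAP` area simply keeps all of its pages on
shrink, so a later full-size userspace mapping stays valid.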

2. Fixing the kmemleak scanner panic
Kmemleak tracks the original allocation size and scans it periodically.
If we unmap and free tail pages without notifying kmemleak, its scanner
will fault on the unmapped virtual addresses.
Plan: I'll add a call to `kmemleak_free_part()` during the shrink so
that the tracked object covers only the pages that remain mapped.
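
As an illustration of the range that would be reported, here is a small
userspace model. It only mimics the pointer-plus-size shape of
`kmemleak_free_part()`; `struct model_tracked`, `model_free_part()` and
`model_shrink()` are hypothetical names, and the 4 KiB page size is an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the model */

/* Hypothetical model of a tracked object: [start, start + size). */
struct model_tracked {
	uintptr_t start;
	size_t size;
};

/*
 * Stand-in for reporting that a tail range is gone. This model only
 * accepts a range that ends exactly at the end of the object.
 */
static int model_free_part(struct model_tracked *obj, uintptr_t ptr,
			   size_t size)
{
	if (ptr < obj->start || ptr + size != obj->start + obj->size)
		return -1;
	obj->size -= size;
	return 0;
}

/* Compute the freed tail from the page counts and report it. */
static int model_shrink(struct model_tracked *obj, size_t old_nr_pages,
			size_t new_nr_pages)
{
	uintptr_t tail;
	size_t len;

	if (new_nr_pages >= old_nr_pages)
		return 0;	/* nothing was freed */

	tail = obj->start + new_nr_pages * MODEL_PAGE_SIZE;
	len = (old_nr_pages - new_nr_pages) * MODEL_PAGE_SIZE;
	return model_free_part(obj, tail, len);
}
```

After the call, the tracked range ends at the last page that is still
mapped, which is the property the scanner needs.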

3. Fixing a /proc/vmallocinfo race condition
`show_numa_info()` walks `v->pages[]` up to `v->nr_pages`. If a shrink
modifies `nr_pages` and NULLs out the page pointers concurrently, a
reader could dereference a NULL page pointer.
Plan: I'll update the reader to use `READ_ONCE(v->nr_pages)`, and have
the shrink path do a `WRITE_ONCE(vm->nr_pages, new_nr_pages)` before
freeing the pages. This guarantees that concurrent readers either see
the old count with valid pages or the new, smaller count.
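
The intended ordering can be sketched in userspace as below. This is a
single-threaded model, so it only shows the order of operations (publish
the smaller count first, then drop the pages) and is not evidence of
concurrency safety by itself. The helpers `model_read_once()` and
`model_write_once()` are simplified stand-ins for `READ_ONCE()` and
`WRITE_ONCE()`, and all other names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_PAGES 8

/* Hypothetical, trimmed-down area: a page count and a page array. */
struct model_area {
	unsigned int nr_pages;
	void *pages[MODEL_MAX_PAGES];
};

/* Simplified stand-ins: force a single volatile load or store. */
static unsigned int model_read_once(const unsigned int *p)
{
	return *(const volatile unsigned int *)p;
}

static void model_write_once(unsigned int *p, unsigned int v)
{
	*(volatile unsigned int *)p = v;
}

/* Writer: publish the new count before releasing the tail entries. */
static void model_shrink(struct model_area *vm, unsigned int new_nr_pages)
{
	unsigned int old = vm->nr_pages;
	unsigned int i;

	if (new_nr_pages >= old)
		return;

	model_write_once(&vm->nr_pages, new_nr_pages);
	for (i = new_nr_pages; i < old; i++)
		vm->pages[i] = NULL;	/* stands in for freeing the page */
}

/* Reader: snapshot the count once, then walk only that many entries. */
static int model_count_pages(const struct model_area *vm)
{
	unsigned int n = model_read_once(&vm->nr_pages);
	unsigned int i;
	int seen = 0;

	for (i = 0; i < n; i++) {
		if (vm->pages[i] == NULL)
			return -1;	/* the real reader would crash here */
		seen++;
	}
	return seen;
}
```

A reader that starts after the shrink never looks past the new count,
so it never touches a NULLed entry in this model.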

4. Fixing a stale data leak on grow
A `vrealloc()` grow with `__GFP_ZERO` could expose previously discarded
data if an intermediate shrink happened without `__GFP_ZERO`, because
that shrink does not zero the region it discards.
Plan: I'll add mandatory zeroing in the grow-in-place path when
`want_init_on_alloc()` is true, so that any newly exposed bytes are
cleared.
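
A minimal userspace sketch of that zeroing step, assuming a flat buffer
with a tracked `requested_size`. The boolean `want_zero` stands in for
the result of `want_init_on_alloc()`; the struct and function names are
hypothetical and do not match the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an allocation with spare capacity. */
struct model_alloc {
	unsigned char *buf;
	size_t capacity;	/* bytes actually backed */
	size_t requested_size;	/* bytes the caller asked for */
};

/* Shrink without zeroing: the discarded bytes keep their old contents. */
static void model_shrink(struct model_alloc *a, size_t new_size)
{
	if (new_size < a->requested_size)
		a->requested_size = new_size;
}

/* Grow in place, clearing the newly exposed bytes when asked to. */
static bool model_grow_in_place(struct model_alloc *a, size_t new_size,
				bool want_zero)
{
	if (new_size > a->capacity)
		return false;	/* caller must reallocate instead */

	if (want_zero && new_size > a->requested_size)
		memset(a->buf + a->requested_size, 0,
		       new_size - a->requested_size);

	a->requested_size = new_size;
	return true;
}
```

The bytes between the old and new `requested_size` are exactly the ones
a caller could newly observe, so those are the only bytes cleared.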

Thanks again to Alice and Danilo for prompting the closer look, and to
Andrew for providing the review. I should have v6 ready for review soon.

Best regards,
Shivam