Re: [PATCH v3 2/2] mm/vmalloc: free unused pages on vrealloc() shrink

From: Shivam Kalra

Date: Sat Mar 14 2026 - 03:02:11 EST


On 12/03/26 13:29, Alice Ryhl wrote:
> On Mon, Mar 09, 2026 at 05:25:46PM +0530, Shivam Kalra via B4 Relay wrote:
>> From: Shivam Kalra <shivamkalra98@xxxxxxxxxxx>
>>
>> When vrealloc() shrinks an allocation and the new size crosses a page
>> boundary, unmap and free the tail pages that are no longer needed. This
>> reclaims physical memory that was previously wasted for the lifetime
>> of the allocation.
>>
>> The heuristic is simple: always free when at least one full page becomes
>> unused. Huge page allocations (page_order > 0) are skipped, as partial
>> freeing would require splitting.
>>
>> The virtual address reservation (vm->size / vmap_area) is intentionally
>> kept unchanged, preserving the address for potential future grow-in-place
>> support.
>>
>> Fix the grow-in-place check to compare against vm->nr_pages rather than
>> get_vm_area_size(), since the latter reflects the virtual reservation
>> which does not shrink. Without this fix, a grow after shrink would
>> access freed pages.
>>
>> Signed-off-by: Shivam Kalra <shivamkalra98@xxxxxxxxxxx>
>> ---
>> mm/vmalloc.c | 19 ++++++++++++++-----
>> 1 file changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index 42ae68450a90..114e0bd1030e 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -4344,14 +4344,23 @@ void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align
>>  			goto need_realloc;
>>  	}
>>
>> -	/*
>> -	 * TODO: Shrink the vm_area, i.e. unmap and free unused pages. What
>> -	 * would be a good heuristic for when to shrink the vm_area?
>> -	 */
>>  	if (size <= old_size) {
>> +		unsigned int new_nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>> +
>>  		/* Zero out "freed" memory, potentially for future realloc. */
>>  		if (want_init_on_free() || want_init_on_alloc(flags))
>>  			memset((void *)p + size, 0, old_size - size);
>> +
>> +		/* Free tail pages when shrink crosses a page boundary. */
>> +		if (new_nr_pages < vm->nr_pages && !vm_area_page_order(vm)) {
>> +			unsigned long addr = (unsigned long)p;
>> +
>> +			vunmap_range(addr + (new_nr_pages << PAGE_SHIFT),
>> +				     addr + (vm->nr_pages << PAGE_SHIFT));
>> +
>> +			vmalloc_free_pages(vm, new_nr_pages, vm->nr_pages);
>
> This leaves the range vm->pages[new_nr_pages .. old_nr_pages] with
> non-NULL but freed page pointers. It seems less error prone to set those
> entries of vm->pages to NULL here.
>
> Note that it's not a problem for existing usage of vmalloc_free_pages(),
> because it is immediately followed by kvfree(vm->pages).
>
> Alice
>
>> +			vm->nr_pages = new_nr_pages;
>> +		}
>>  		vm->requested_size = size;
>>  		kasan_vrealloc(p, old_size, size);
>>  		return (void *)p;
>> @@ -4360,7 +4369,7 @@ void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align
>>  	/*
>>  	 * We already have the bytes available in the allocation; use them.
>>  	 */
>> -	if (size <= alloced_size) {
>> +	if (size <= (size_t)vm->nr_pages << PAGE_SHIFT) {
>>  		/*
>>  		 * No need to zero memory here, as unused memory will have
>>  		 * already been zeroed at initial allocation time or during
>>
>> --
>> 2.43.0
>>
>>
Yeah, I agree. I will update this too.
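
For reference, a rough userspace sketch of the idea: free the tail entries
and clear their slots so no stale pointers remain in the array. This is
only a model, not the kernel code. struct model_area and model_shrink()
are made-up names standing in for the relevant vm_struct fields and the
shrink path, and malloc()/free() stand in for page allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the vm_struct fields touched by the shrink path. */
struct model_area {
	void **pages;		/* models vm->pages */
	unsigned int nr_pages;	/* models vm->nr_pages */
};

/*
 * Free entries [new_nr_pages, nr_pages) and set each slot to NULL, so the
 * tail of the array never holds a pointer to freed memory. A request that
 * does not shrink the area is a no-op.
 */
static void model_shrink(struct model_area *area, unsigned int new_nr_pages)
{
	unsigned int i;

	assert(area && area->pages);

	if (new_nr_pages >= area->nr_pages)
		return;

	for (i = new_nr_pages; i < area->nr_pages; i++) {
		free(area->pages[i]);	/* models freeing the page */
		area->pages[i] = NULL;	/* clear the stale pointer */
	}

	area->nr_pages = new_nr_pages;
}
```

In the patch itself this would presumably come down to clearing
vm->pages[new_nr_pages .. vm->nr_pages) after vmalloc_free_pages() and
before vm->nr_pages is updated.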