Re: [PATCH v5 2/3] mm/vmalloc: free unused pages on vrealloc() shrink
From: Shivam Kalra
Date: Tue Mar 17 2026 - 12:05:12 EST
On 17/03/26 20:15, Danilo Krummrich wrote:
> On Tue Mar 17, 2026 at 3:39 PM CET, Alice Ryhl wrote:
>> On Tue, Mar 17, 2026 at 01:47:34PM +0530, Shivam Kalra wrote:
>>> When vrealloc() shrinks an allocation and the new size crosses a page
>>> boundary, unmap and free the tail pages that are no longer needed. This
>>> reclaims physical memory that was previously wasted for the lifetime
>>> of the allocation.
>>>
>>> The heuristic is simple: always free when at least one full page becomes
>>> unused. Huge page allocations (page_order > 0) are skipped, as partial
>>> freeing would require splitting. Allocations with VM_FLUSH_RESET_PERMS
>>> are also skipped, as their direct-map permissions must be reset before
>>> pages are returned to the page allocator, which is handled by
>>> vm_reset_perms() during vfree().
>>>
>>> The virtual address reservation (vm->size / vmap_area) is intentionally
>>> kept unchanged, preserving the address for potential future grow-in-place
>>> support.
>>>
>>> Fix the grow-in-place check to compare against vm->nr_pages rather than
>>> get_vm_area_size(), since the latter reflects the virtual reservation
>>> which does not shrink. Without this fix, a grow after shrink would
>>> access freed pages.
>>>
>>> Signed-off-by: Shivam Kalra <shivamkalra98@xxxxxxxxxxx>
>
> Feel free to add
>
> Suggested-by: Danilo Krummrich <dakr@xxxxxxxxxx>
>
>>> ---
>>> mm/vmalloc.c | 20 +++++++++++++++-----
>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index b29bf58c0e3f..f3820c6712c1 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -4345,14 +4345,24 @@ void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align
>>>  			goto need_realloc;
>>>  	}
>>>
>>> -	/*
>>> -	 * TODO: Shrink the vm_area, i.e. unmap and free unused pages. What
>>> -	 * would be a good heuristic for when to shrink the vm_area?
>>> -	 */
>>>  	if (size <= old_size) {
>>> +		unsigned int new_nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>>> +
>>>  		/* Zero out "freed" memory, potentially for future realloc. */
>>>  		if (want_init_on_free() || want_init_on_alloc(flags))
>>>  			memset((void *)p + size, 0, old_size - size);
>>> +
>>> +		/* Free tail pages when shrink crosses a page boundary. */
>>> +		if (new_nr_pages < vm->nr_pages && !vm_area_page_order(vm) &&
>>> +		    !(vm->flags & VM_FLUSH_RESET_PERMS)) {
>>> +			unsigned long addr = (unsigned long)p;
>>> +
>>> +			vunmap_range(addr + (new_nr_pages << PAGE_SHIFT),
>>> +				     addr + (vm->nr_pages << PAGE_SHIFT));
>>> +
>>> +			vm_area_free_pages(vm, new_nr_pages, vm->nr_pages);
>>> +			vm->nr_pages = new_nr_pages;
>>> +		}
>>>  		vm->requested_size = size;
>>>  		kasan_vrealloc(p, old_size, size);
>>>  		return (void *)p;
>>> @@ -4361,7 +4371,7 @@ void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned long align
>>>  	/*
>>>  	 * We already have the bytes available in the allocation; use them.
>>>  	 */
>>> -	if (size <= alloced_size) {
>>> +	if (size <= (size_t)vm->nr_pages << PAGE_SHIFT) {
>>>  		/*
>>>  		 * No need to zero memory here, as unused memory will have
>>>  		 * already been zeroed at initial allocation time or during
>>
>> Hmm. So what happened here is that it has previously always been the
>> case that get_vm_area_size(area) == vm->nr_pages << PAGE_SHIFT, so these
>> constants were interchangeable. But now that is no longer the case.
>>
>> For example, 'remap_vmalloc_range_partial' compares the vm area size
>> with the range being mapped, and then proceeds to look up the pages and
>> map them. But now those pages may be missing.
>>
>> I can't really tell if there are other places in this file that need to
>> be updated too.
>
> This may well be possible. I remember that when I added vrealloc() and looked
> into growing and shrinking, I concluded that it might need a bit of rework in
> terms of tracking the sizes of the different layers. Unfortunately, I don't
> remember the details anymore, but I'm quite sure there were some subtleties
> along the lines of what Alice points out, so I recommend double-checking.
I will double-check and follow up on this thread if I find any issues.
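
To make the concern concrete, here is a minimal userspace sketch of how the
two sizes drift apart after a shrink. It is only a model under stated
assumptions: every model_* name and the MODEL_PAGE_* constants are
hypothetical stand-ins, not kernel definitions. It ignores the guard page and
does not touch page tables. The two range checks are illustrative and are not
copied from remap_vmalloc_range_partial().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

struct model_vm {
	size_t reserved_size;	/* models get_vm_area_size(): never shrinks */
	unsigned int nr_pages;	/* models vm->nr_pages: pages still backed */
};

/* Pages needed to back size bytes, like PAGE_ALIGN(size) >> PAGE_SHIFT. */
static unsigned int model_pages_for(size_t size)
{
	return (unsigned int)((size + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT);
}

/* Bytes that are still backed by pages. */
static size_t model_backed_size(const struct model_vm *vm)
{
	return (size_t)vm->nr_pages << MODEL_PAGE_SHIFT;
}

/*
 * Models the shrink path: drop the tail pages but keep the reservation.
 * Returns the number of pages that would be freed.
 */
static unsigned int model_shrink(struct model_vm *vm, size_t new_size)
{
	unsigned int new_nr_pages = model_pages_for(new_size);
	unsigned int freed = 0;

	if (new_nr_pages < vm->nr_pages) {
		freed = vm->nr_pages - new_nr_pages;
		vm->nr_pages = new_nr_pages;
	}
	return freed;
}

/* Range check against the reservation (the pre-shrink assumption). */
static int model_range_ok_by_reservation(const struct model_vm *vm,
					 size_t off, size_t len)
{
	return off <= vm->reserved_size && len <= vm->reserved_size - off;
}

/* Range check against the pages that are actually still present. */
static int model_range_ok_by_backing(const struct model_vm *vm,
				     size_t off, size_t len)
{
	size_t backed = model_backed_size(vm);

	return off <= backed && len <= backed - off;
}
```

With a four-page area shrunk to one page plus one byte, the model keeps the
four-page reservation but only two backed pages. A three-page range then
still passes the reservation-based check while failing the backing-based one,
which is the kind of caller that would need auditing.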