Re: [PATCH v2 2/3] vmalloc: Optimize vfree
From: Vlastimil Babka (SUSE)
Date: Fri Mar 20 2026 - 10:34:50 EST
On 3/20/26 09:39, David Hildenbrand (Arm) wrote:
> On 3/16/26 16:49, Vlastimil Babka wrote:
>>> mm/vmalloc.c | 34 +++++++++++++++++++++++++---------
>>> 1 file changed, 25 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index c607307c657a6..8b935395fb068 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -3459,18 +3459,34 @@ void vfree(const void *addr)
>>>
>>> if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
>>> vm_reset_perms(vm);
>>> - for (i = 0; i < vm->nr_pages; i++) {
>>> - struct page *page = vm->pages[i];
>>> +
>>> + if (vm->nr_pages) {
>>> + bool account = !(vm->flags & VM_MAP_PUT_PAGES);
>>> + unsigned long start_pfn, pfn;
>>> + struct page *page = vm->pages[0];
>>> + int nr = 1;
>>>
>>> BUG_ON(!page);
>>> - /*
>>> - * High-order allocs for huge vmallocs are split, so
>>> - * can be freed as an array of order-0 allocations
>>> - */
>>> - if (!(vm->flags & VM_MAP_PUT_PAGES))
>>> + start_pfn = page_to_pfn(page);
>>> + if (account)
>>> mod_lruvec_page_state(page, NR_VMALLOC, -1);
>>> - __free_page(page);
>>> - cond_resched();
>>> +
>>> + for (i = 1; i < vm->nr_pages; i++) {
>>> + page = vm->pages[i];
>>> + BUG_ON(!page);
>>
>> We shouldn't be adding BUG_ON()s. Rather, also demote the pre-existing
>> one to VM_WARN_ON_ONCE() and skip gracefully.
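
IOW, something like this instead of each BUG_ON() (untested;
VM_WARN_ON_ONCE() can't be used as an expression, hence the separate
test):

	if (unlikely(!page)) {
		/* demoted from BUG_ON(): warn under CONFIG_DEBUG_VM, skip */
		VM_WARN_ON_ONCE(1);
		continue;
	}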
>>
>>> + if (account)
>>> + mod_lruvec_page_state(page, NR_VMALLOC, -1);
>>
>> I think we should be able to batch this too, using "nr"?
>
> Are we sure that pages cannot cross nodes etc? It could happen that we
> have a contig range that spans zones/nodes/etc ...
Hmm, a single order-3 allocation can't, but we could be unlucky and get
the last order-3 block from zone X and the first order-3 block from an
adjacent zone Y. In that case the loop would also need to check for the
same zone/node.
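
I.e. something like this in the batching loop (untested; flush_batch()
is a made-up stand-in for however the patch frees the contiguous run):

	struct page *page = vm->pages[i];

	/* batch only while pfn-contiguous within the same zone */
	if (page_to_pfn(page) == start_pfn + nr &&
	    page_zone(page) == page_zone(pfn_to_page(start_pfn))) {
		nr++;
		continue;
	}
	flush_batch(start_pfn, nr);
	start_pfn = page_to_pfn(page);
	nr = 1;

Same zone implies same node, so a single zone check should cover both.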
> Anyhow, should we try to decouple both things, providing a
> core-mm function to do the page freeing?
>
> We do have something similar, the optimized unpinning of large folios
> in unpin_user_pages_dirty_lock(), but the case here is a bit different.
>
>
> So what I'm thinking this code here could do:
>
> if (!(vm->flags & VM_MAP_PUT_PAGES)) {
>         for (i = 0; i < vm->nr_pages; i++)
>                 mod_lruvec_page_state(vm->pages[i], NR_VMALLOC, -1);
> }
> free_pages_bulk(vm->pages, vm->nr_pages);
>
>
> We could optimize the first loop to do batching where possible as well.
>
>
> free_pages_bulk() would match alloc_pages_bulk()
>
> void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
>
> Internally we'd do the contig handling.
>
> Was that already discussed?
AFAIU some of Zi's replies hinted at this direction. It would make sense, yeah.
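
For reference, such a free_pages_bulk() could boil down to something
like this (completely untested sketch; __free_page_run() is a made-up
placeholder for the actual batched freeing inside the allocator):

	static void __free_page_run(unsigned long start_pfn, unsigned long nr)
	{
		/* placeholder: the real thing would batch-free the run */
		while (nr--)
			__free_page(pfn_to_page(start_pfn + nr));
	}

	void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
	{
		unsigned long start_pfn = 0, nr = 0, i;

		/* assumes no NULL entries in the array */
		for (i = 0; i < nr_pages; i++) {
			struct page *page = page_array[i];

			/* extend the run while pfn-contiguous in one zone */
			if (nr && page_to_pfn(page) == start_pfn + nr &&
			    page_zone(page) == page_zone(pfn_to_page(start_pfn))) {
				nr++;
				continue;
			}
			if (nr)
				__free_page_run(start_pfn, nr);
			start_pfn = page_to_pfn(page);
			nr = 1;
		}
		if (nr)
			__free_page_run(start_pfn, nr);
	}

That would keep all the zone/node awareness in one core-mm place
instead of open-coding it in vfree().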