Re: [PATCH v2 2/3] vmalloc: Optimize vfree
From: Muhammad Usama Anjum
Date: Mon Mar 23 2026 - 07:29:29 EST
On 20/03/2026 2:33 pm, Vlastimil Babka (SUSE) wrote:
> On 3/20/26 09:39, David Hildenbrand (Arm) wrote:
>> On 3/16/26 16:49, Vlastimil Babka wrote:
>>>> mm/vmalloc.c | 34 +++++++++++++++++++++++++---------
>>>> 1 file changed, 25 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>>> index c607307c657a6..8b935395fb068 100644
>>>> --- a/mm/vmalloc.c
>>>> +++ b/mm/vmalloc.c
>>>> @@ -3459,18 +3459,34 @@ void vfree(const void *addr)
>>>>
>>>> if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
>>>> vm_reset_perms(vm);
>>>> - for (i = 0; i < vm->nr_pages; i++) {
>>>> - struct page *page = vm->pages[i];
>>>> +
>>>> + if (vm->nr_pages) {
>>>> + bool account = !(vm->flags & VM_MAP_PUT_PAGES);
>>>> + unsigned long start_pfn, pfn;
>>>> + struct page *page = vm->pages[0];
>>>> + int nr = 1;
>>>>
>>>> BUG_ON(!page);
>>>> - /*
>>>> - * High-order allocs for huge vmallocs are split, so
>>>> - * can be freed as an array of order-0 allocations
>>>> - */
>>>> - if (!(vm->flags & VM_MAP_PUT_PAGES))
>>>> + start_pfn = page_to_pfn(page);
>>>> + if (account)
>>>> mod_lruvec_page_state(page, NR_VMALLOC, -1);
>>>> - __free_page(page);
>>>> - cond_resched();
>>>> +
>>>> + for (i = 1; i < vm->nr_pages; i++) {
>>>> + page = vm->pages[i];
>>>> + BUG_ON(!page);
>>>
>>> We shouldn't be adding BUG_ON()'s. Rather demote also the pre-existing one
>>> to VM_WARN_ON_ONCE() and skip gracefully.
>>>
>>>> + if (account)
>>>> + mod_lruvec_page_state(page, NR_VMALLOC, -1);
>>>
>>> I think we should be able to batch this too, using "nr"?
>>
>> Are we sure that pages cannot cross nodes etc? It could happen that we
>> have a contig range that spans zones/nodes/etc ...
>
> Hmm, a single order-3 allocation can't, but we could be unlucky and get the
> last order-3 block from zone X and the first order-3 block from the adjacent
> zone Y. In that case the loop would also need to check for the same zone/node.
>
>> Anyhow, should we try to decouple both things, providing a
>> core-mm function to do the page freeing?
>>
>> We do have something similar, optimized unpinning of large folios,
>> in unpin_user_pages_dirty_lock(). This here is a bit different.
>>
>>
>> So what I am thinking about for this code here to do:
>>
>> if (!(vm->flags & VM_MAP_PUT_PAGES)) {
>> 	for (i = 0; i < vm->nr_pages; i++)
>> 		mod_lruvec_page_state(vm->pages[i], NR_VMALLOC, -1);
>> }
>> free_pages_bulk(vm->pages, vm->nr_pages);
>>
>>
>> We could optimize the first loop to do batching where possible as well.
>>
>>
>> free_pages_bulk() would match alloc_pages_bulk()
>>
>> void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
>>
>> Internally we'd do the contig handling.
>>
>> Was that already discussed?
>
> AFAIU some of Zi's replies hinted at this direction. It would make sense, yeah.
I'm updating the patch and will send the next version.
Thanks,
Usama