Re: [PATCH v3 0/9] Optimize anonymous large folio unmapping

From: Dev Jain

Date: Mon May 11 2026 - 02:23:09 EST

On 09/05/26 5:08 am, Andrew Morton wrote:
> On Wed, 6 May 2026 15:14:55 +0530 Dev Jain <dev.jain@xxxxxxx> wrote:
>
>> Speed up unmapping of anonymous large folios by clearing the ptes, and
>> setting swap ptes, in one go.
>>
>> ...
>>
>> Performance as measured on a Linux VM on Apple M3 (arm64):
>>
>> Vanilla - Mean: 37401913 ns, std dev: 12%
>> Patched - Mean: 17420282 ns, std dev: 11%
>>
>> No regression observed on 4K folios.
>>
>> Performance as measured on bare metal x86:
>>
>> Vanilla - mean: 54986286 ns, std dev: 1.5%
>> Patched - mean: 51930795 ns, std dev: 3%
>
> That looks nice.
>
> I'll pass at this time, wait for reviewer input. Most reviewers are
> jetlagged and exhausted, so a resend might be needed ;)
>
> Sashiko said a few things:
> https://sashiko.dev/#/patchset/20260506094504.2588857-1-dev.jain@xxxxxxx

Patch 2:

"In the original code, failing hugetlb_vma_trylock_write() triggered a
goto walk_abort, leaving ret set to true."

That is wrong.

Patch 9:

"Since __HAVE_ARCH_UNMAP_ONE is typically defined without a value on sparc64,
__is_defined() will evaluate to 0 because it is primarily designed for Kconfig
symbols that explicitly evaluate to 1."

Which is again wrong?
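
For reference, the macro behaviour quoted above can be checked outside the
kernel. The sketch below is a userspace adaptation of the __is_defined()
helper from include/linux/kconfig.h. The demo_ and DEMO_ names are
hypothetical, and the extra trailing argument is an addition made here so the
variadic macro always receives a non-empty "..." list. It only shows how the
helper treats an empty definition; it does not settle how
__HAVE_ARCH_UNMAP_ONE is defined on sparc64 with this series applied.

```c
#include <assert.h>

/* Userspace adaptation of the kernel's __is_defined() helper. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define demo_take_second_arg(ignored, val, ...) val
#define demo_is_defined(x)     demo_is_defined_2(x)
#define demo_is_defined_2(val) demo_is_defined_3(DEMO_ARG_PLACEHOLDER_##val)
/* The trailing extra 0 is not in the kernel; it keeps "..." non-empty. */
#define demo_is_defined_3(arg1_or_junk) \
	demo_take_second_arg(arg1_or_junk 1, 0, 0)

/* Hypothetical symbols, for illustration only. */
#define DEMO_DEFINED_AS_ONE 1
#define DEMO_DEFINED_EMPTY
/* DEMO_NOT_DEFINED is intentionally left undefined. */

int demo_defined_as_one(void) { return demo_is_defined(DEMO_DEFINED_AS_ONE); }
int demo_defined_empty(void)  { return demo_is_defined(DEMO_DEFINED_EMPTY); }
int demo_not_defined(void)    { return demo_is_defined(DEMO_NOT_DEFINED); }

/* Only a symbol that expands to 1 is reported as defined. */
void demo_self_check(void)
{
	assert(demo_defined_as_one() == 1);
	assert(demo_defined_empty() == 0);
	assert(demo_not_defined() == 0);
}
```

A plain #ifdef test, by contrast, treats DEMO_DEFINED_EMPTY as defined, which
is where the two styles of check can disagree.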

Patch 9:

"What happens to the remaining pages in the batch? Since get_and_clear_ptes()
cleared all of them upfront, and the loop aborts early without restoring them,
it appears the remaining PTEs are left cleared in the page tables and their
references are not released"

Yes this is valid. I did see it on the v2 Sashiko review but misread it : )

When unmap fails for a sub-batch, I need to restore all the cleared ptes,
not only those of the sub-batch.

This should work:

diff --git a/mm/rmap.c b/mm/rmap.c
index fc953f36d4527..e54c15a82c504 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2023,10 +2023,8 @@ static inline bool __unmap_anon_folio_range(struct vm_area_struct *vma, struct f
swp_entry_t entry = page_swap_entry(subpage);
struct mm_struct *mm = vma->vm_mm;

- if (folio_dup_swap_pages(folio, subpage, nr_pages) < 0) {
- set_ptes(mm, address, ptep, pteval, nr_pages);
+ if (folio_dup_swap_pages(folio, subpage, nr_pages) < 0)
return false;
- }

/*
* arch_unmap_one() is expected to be a NOP on
@@ -2036,16 +2034,14 @@ static inline bool __unmap_anon_folio_range(struct vm_area_struct *vma, struct f
if (arch_unmap_one(mm, vma, address, pteval) < 0) {
VM_WARN_ON(nr_pages != 1);
folio_put_swap_pages(folio, subpage, nr_pages);
- set_pte_at(mm, address, ptep, pteval);
return false;
}

/* See folio_try_share_anon_rmap(): clear PTE first. */
if (anon_exclusive && folio_try_share_anon_rmap_ptes(folio, subpage, nr_pages)) {
folio_put_swap_pages(folio, subpage, nr_pages);
- set_ptes(mm, address, ptep, pteval, nr_pages);
return false;
}

if (list_empty(&mm->mmlist)) {
spin_lock(&mmlist_lock);
@@ -2075,8 +2071,10 @@ static inline bool unmap_anon_folio_range(struct vm_area_struct *vma, struct fol
first_page, expected_anon_exclusive);
ret = __unmap_anon_folio_range(vma, folio, first_page + sub_batch_idx,
address, ptep, pteval, len, expected_anon_exclusive);
- if (!ret)
+ if (!ret) {
+ set_ptes(vma->vm_mm, address, ptep, pteval, nr_pages);
return ret;
+ }

nr_pages -= len;
if (!nr_pages)
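
The restore rule in the last hunk can be modelled in userspace. This is a toy
sketch, not kernel code: PTEs are plain integers, DEMO_SWAP_BIT and every
demo_ name are hypothetical, and fail_at stands in for a sub-batch whose unmap
fails. It assumes, as the hunk suggests, that address, ptep and pteval advance
with each sub-batch, so restoring nr_pages entries covers the failing
sub-batch and everything after it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_BATCH 16
#define DEMO_SWAP_BIT  0x80000000u	/* hypothetical marker for a swap entry */

/*
 * Toy model: clear the whole batch, then convert it sub_len entries at a time.
 * fail_at is the index of the sub-batch that fails; pass (size_t)-1 for none.
 */
bool demo_unmap_batch(unsigned int *ptes, size_t nr, size_t sub_len,
		      size_t fail_at)
{
	unsigned int saved[DEMO_MAX_BATCH];
	size_t i, sub = 0;

	assert(nr <= DEMO_MAX_BATCH && sub_len > 0);

	/* Clear everything up front, remembering the old values. */
	memcpy(saved, ptes, nr * sizeof(*ptes));
	memset(ptes, 0, nr * sizeof(*ptes));

	for (i = 0; i < nr; i += sub_len, sub++) {
		size_t len = (nr - i < sub_len) ? nr - i : sub_len;
		size_t j;

		if (sub == fail_at) {
			/* Restore the failing sub-batch and all later entries. */
			memcpy(&ptes[i], &saved[i], (nr - i) * sizeof(*ptes));
			return false;
		}

		/* Success: install the "swap" entries for this sub-batch. */
		for (j = 0; j < len; j++)
			ptes[i + j] = saved[i + j] | DEMO_SWAP_BIT;
	}
	return true;
}
```

With six entries, a sub-batch length of two and a failure in the second
sub-batch, the first two entries stay converted and the last four are put back
as they were. Restoring only the failing sub-batch would leave the final two
entries cleared.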
