Re: [PATCH v3 9/9] mm/rmap: enable batch unmapping of anonymous folios
From: Dev Jain
Date: Tue May 12 2026 - 05:18:20 EST
On 11/05/26 1:46 pm, David Hildenbrand (Arm) wrote:
> On 5/6/26 11:45, Dev Jain wrote:
>> Enable batch clearing of ptes, and batch swap setting of ptes for anon
>> folio unmapping.
>>
>> Processing all ptes of a large folio in one go helps us batch across
>> atomics (add_mm_counter etc), barriers (in the function
>> __folio_try_share_anon_rmap), repeated calls to page_vma_mapped_walk(),
>> to name a few. In general, batching lets us execute similar code
>> together, which makes execution more memory- and CPU-friendly.
>>
>> On arm64-contpte, batching also helps us avoid redundant ptep_get() calls
>> and TLB flushes while breaking the contpte mapping.
>>
>> The handling of anon-exclusivity is very similar to commit cac1db8c3aad
>> ("mm: optimize mprotect() by PTE batching"). Since folio_unmap_pte_batch()
>> won't look at the bits of the underlying page, we need to process
>> sub-batches of ptes pointing to pages that share the same anon-exclusive
>> state, and batch set only those ptes to swap ptes in one go. Hence export
>> page_anon_exclusive_sub_batch() to internal.h and reuse it.
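
As an illustration only, here is a minimal userspace sketch of the
sub-batching idea described above. It is not the code from the patch and
not the real page_anon_exclusive_sub_batch(); the names
excl_sub_batch_len() and split_into_sub_batches() are hypothetical, and a
plain bool array stands in for the per-page anon-exclusive state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: excl[i] stands in for the anon-exclusive state of
 * the i-th page of the batch. Returns the length of the leading run of
 * entries that share the state of excl[0].
 */
static size_t excl_sub_batch_len(const bool *excl, size_t max_len)
{
	size_t len;

	assert(max_len > 0);

	for (len = 1; len < max_len; len++) {
		if (excl[len] != excl[0])
			break;
	}
	return len;
}

/*
 * Walk a batch of nr entries run by run. Every run is uniform in
 * exclusivity, so a caller could convert it in one go. Returns the number
 * of runs; run_len[] (caller-provided, at least nr slots) gets each length.
 */
static size_t split_into_sub_batches(const bool *excl, size_t nr,
				     size_t *run_len)
{
	size_t done = 0, runs = 0;

	while (done < nr) {
		size_t len = excl_sub_batch_len(excl + done, nr - done);

		run_len[runs++] = len;
		done += len;
	}
	return runs;
}
```

For a batch whose states are {1, 1, 0, 0, 0, 1}, this model yields three
sub-batches of lengths 2, 3 and 1.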
>>
>> arch_unmap_one() is only defined for sparc64; I am not confident about
>> the nuances between retrieving the pfn via pte_pfn() and via
>> (paddr = pte_val(oldpte) & _PAGE_PADDR_4V).
>>
>> (Also, pte_next_pfn() can't even be called from arch_unmap_one() because
>> that file does not include pgtable.h.) So just disable the
>> "sparc64-anon-swapbacked" case for now.
>>
>> We need to take care of rmap accounting (folio_remove_rmap_ptes) and
>> reference accounting (folio_put_refs) when anon folio unmap succeeds.
>> In case we partially batch the large folio and fail, we need to correctly
>> do the accounting for pages which were successfully unmapped. So, put
>> this accounting code in __unmap_anon_folio() itself, instead of doing
>> some horrible goto jumping at the callsite of unmap_anon_folio().
>>
>> Add a comment at relevant places to say that we are on a device-exclusive
>> entry and not a present entry.
>>
>> If the batch length is less than the number of pages in the folio, then
>> we must skip over this batch.
>>
>> The page_vma_mapped_walk() API ensures this: check_pte() will return true
>> only if any of [pvmw->pfn, pvmw->pfn + nr_pages) is mapped by the pte.
>> There is no pfn underlying a swap pte, so check_pte() returns false and we
>> keep skipping until we hit a present pte, which is where we want to start
>> unmapping from next.
>>
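
The skipping behaviour described in the last paragraph of the commit
message can be modelled in userspace roughly as follows. This is a sketch
under simplifying assumptions, not the kernel's check_pte(): struct
model_pte, model_check_pte() and model_next_mapped() are hypothetical
names, and special non-present entries (such as the device-exclusive ones
mentioned above) are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pte model: present entries carry a pfn, swap entries do not. */
struct model_pte {
	bool present;
	unsigned long pfn;	/* meaningful only when present is true */
};

/* True only if the entry maps a pfn inside [start, start + nr). */
static bool model_check_pte(const struct model_pte *pte,
			    unsigned long start, unsigned long nr)
{
	if (!pte->present)
		return false;	/* no pfn behind a swap entry */
	return pte->pfn >= start && pte->pfn - start < nr;
}

/*
 * Index of the first slot at or after 'from' that passes the check,
 * or n if there is none. Swap entries are stepped over one by one.
 */
static size_t model_next_mapped(const struct model_pte *ptes, size_t n,
				size_t from, unsigned long start,
				unsigned long nr)
{
	size_t i;

	assert(from <= n);

	for (i = from; i < n; i++) {
		if (model_check_pte(&ptes[i], start, nr))
			break;
	}
	return i;
}
```

In this model, once the first two slots of a four-page folio have been
turned into swap entries, a walk restarted from slot 0 resumes at slot 2,
the first entry that is still present.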
>
> This patch is doing too much. Please separate the cleanups (e.g., moving stuff
> into helpers -- that likely should have a ttu_ prefix) from the real deal.
Okay.
>
>