Re: [PATCH v2 4/6] mm: add a batched helper to clear the young flag for large folios
From: David Hildenbrand (Arm)
Date: Tue Mar 03 2026 - 03:51:15 EST
On 3/3/26 03:36, Baolin Wang wrote:
>
>
> On 3/2/26 5:07 PM, David Hildenbrand (Arm) wrote:
>> On 2/27/26 10:44, Baolin Wang wrote:
>>> Currently, MGLRU will call ptep_test_and_clear_young_notify() to check and
>>> clear the young flag for each PTE sequentially, which is inefficient for
>>> large folio reclamation.
>>>
>>> Moreover, on the Arm64 architecture, which supports contiguous PTEs, the
>>> Arm64-specific ptep_test_and_clear_young() already implements an
>>> optimization to clear the young flags for PTEs within a contiguous range.
>>> However, this is not sufficient. Similar to the Arm64-specific
>>> clear_flush_young_ptes(), we can extend this to perform batched operations
>>> for the entire large folio (which might exceed the contiguous range:
>>> CONT_PTE_SIZE).
>>>
>>> Thus, we can introduce a new batched helper: test_and_clear_young_ptes() and
>>> its wrapper test_and_clear_young_ptes_notify() which are consistent with the
>>> existing functions, to perform batched checking of the young flags for large
>>> folios, which can help improve performance during large folio reclamation when
>>> MGLRU is enabled. And it will be overridden by the architecture that implements
>>> a more efficient batch operation in the following patches.
>>>
>>> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
>>> ---
>>> include/linux/pgtable.h | 38 ++++++++++++++++++++++++++++++++++++++
>>> mm/internal.h | 16 +++++++++++-----
>>> 2 files changed, 49 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>>> index 776993d4567b..29bd9fd04e1e 100644
>>> --- a/include/linux/pgtable.h
>>> +++ b/include/linux/pgtable.h
>>> @@ -1103,6 +1103,44 @@ static inline int clear_flush_young_ptes(struct vm_area_struct *vma,
>>> }
>>> #endif
>>> +#ifndef test_and_clear_young_ptes
>>> +/**
>>> + * test_and_clear_young_ptes - Mark PTEs that map consecutive pages of the same
>>> + * folio as old
>>> + * @vma: The virtual memory area the pages are mapped into.
>>> + * @addr: Address the first page is mapped at.
>>> + * @ptep: Page table pointer for the first entry.
>>> + * @nr: Number of entries to clear access bit.
>>> + *
>>> + * May be overridden by the architecture; otherwise, implemented as a simple
>>> + * loop over ptep_test_and_clear_young().
>>> + *
>>> + * Note that PTE bits in the PTE range besides the PFN can differ. For example,
>>> + * some PTEs might be write-protected.
>>> + *
>>> + * Context: The caller holds the page table lock. The PTEs map consecutive
>>> + * pages that belong to the same folio. The PTEs are all in the same PMD.
>>> + *
>>> + * Returns: whether any PTE was young.
>>> + */
>>> +static inline int test_and_clear_young_ptes(struct vm_area_struct *vma,
>>> + unsigned long addr, pte_t *ptep,
>>> + unsigned int nr)
>>
>> Two tabs ...
>
> Ah, yes, not sure why I missed this one :(
>
>> What happened to using a boolean as return type and for "int young"?
> As I replied to you previously [1], I’d like to do this in a follow-up
> patchset that converts all functions that check the young flag. Does
> that sound OK to you?
>
> [1] https://lore.kernel.org/all/32c538ce-6af8-48a8-86fc-d26ee253af54@xxxxxxxxxxxxxxxxx/
Oh, missed that. Works for me.
--
Cheers,
David
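
For readers following the thread: the quoted kernel-doc describes the generic
fallback as "a simple loop over ptep_test_and_clear_young()", but the function
body is cut off in the quote. The userspace model below is only a sketch of
that loop, not the code from the patch. The model_pte_t type and the model_*
function names are hypothetical stand-ins; the real helper also takes vma and
addr and operates on pte_t. The int return type mirrors the posted signature;
the switch to a boolean is deferred to the follow-up series mentioned above.

```c
/* Hypothetical stand-in for a PTE: only the young (accessed) bit is modeled. */
typedef struct {
	unsigned int young : 1;
} model_pte_t;

/* Models the per-PTE primitive: return the old young bit and clear it. */
static int model_test_and_clear_young(model_pte_t *ptep)
{
	int young = ptep->young;

	ptep->young = 0;
	return young;
}

/*
 * Models the batched helper: clear the young bit on nr consecutive
 * entries and report whether any of them was young.
 */
static int model_test_and_clear_young_ptes(model_pte_t *ptep, unsigned int nr)
{
	int young = 0;
	unsigned int i;

	for (i = 0; i < nr; i++)
		young |= model_test_and_clear_young(&ptep[i]);

	return young;
}
```

In this model a caller gets one answer for the whole range, and every entry
is cleared even after a young one has been found, which is what lets an
architecture substitute a more efficient batch implementation.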