Re: [PATCH v6 1/5] mm: rmap: support batched checks of the references for large folios

From: David Hildenbrand (Arm)

Date: Mon Feb 09 2026 - 10:31:23 EST


On 2/9/26 15:07, Baolin Wang wrote:
Currently, folio_referenced_one() always checks the young flag for each PTE
sequentially, which is inefficient for large folios. This inefficiency is
especially noticeable when reclaiming clean file-backed large folios, where
folio_referenced() is observed as a significant performance hotspot.

Moreover, on Arm64 architecture, which supports contiguous PTEs, there is already
an optimization to clear the young flags for PTEs within a contiguous range.
However, this is not sufficient. We can extend this to perform batched operations
for the entire large folio (which might exceed the contiguous range: CONT_PTE_SIZE).

Introduce a new API: clear_flush_young_ptes() to facilitate batched checking
of the young flags and flushing TLB entries, thereby improving performance
during large folio reclamation. And it will be overridden by the architecture
that implements a more efficient batch operation in the following patches.

While we are at it, rename ptep_clear_flush_young_notify() to
clear_flush_young_ptes_notify() to indicate that this is a batch operation.

Reviewed-by: Harry Yoo <harry.yoo@xxxxxxxxxx>
Reviewed-by: Ryan Roberts <ryan.roberts@xxxxxxx>
Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
---

Thanks!

Acked-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>

--
Cheers,

David