[RFC PATCH] mm: rmap: Don't flush TLB after checking PTE young for page reference

From: Barry Song
Date: Fri Jun 17 2022 - 03:06:14 EST


From: Barry Song <v-songbaohua@xxxxxxxx>

Flushing the TLB is usually expensive, whether it is done in hardware
or in software. Both x86 and arm64 have reduced this overhead by either
removing the TLB flush or deferring it in ptep_clear_flush_young().

Removing the TLB flush gives about a 20% ~ 30% swapout speedup on x86
according to commit b13b1d2d8692b ("x86/mm: In the PTE swapout page
reclaim case clear the accessed bit instead of flushing the TLB").

A similar result was reported on arm64 by commit 3403e56b41c1
("arm64: mm: Don't wait for completion of TLB invalidation when page
aging").

While x86 and arm64 have noticed the problem and resolved it by
modifying ptep_clear_flush_young() to drop or defer the flush, most
other platforms still flush the TLB there. In the LRU code, it seems
pointless to broadcast a TLB invalidation merely because the access
bit was updated.

Dropping the flush in the generic LRU code seems more appropriate than
removing it from ptep_clear_flush_young() on every platform, because
the name of the function implies that a flush is included. Removing
the flush from a function that is named after it is confusing. So this
patch switches the LRU code to ptep_clear_young_notify(), which
explicitly does not flush.

This reduces the cost of TLB broadcasts caused by access bit updates
in the LRU. The side effect is a minor loss of accuracy in the PTE
young data, but mainstream platforms such as x86 and arm64 have shown
that this is not harmful.

Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Alex Van Brunt <avanbrunt@xxxxxxxxxx>
Cc: Shaohua Li <shli@xxxxxxxxxx>
Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
---
This RFC is inspired by the discussion in Yu Zhao's MGLRU:
https://lore.kernel.org/lkml/CAOUHufYvH2LaGyAJZFQNOsGDBKD2++aFnTV6=qaVtcNrKjS_bA@xxxxxxxxxxxxxx/
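
For readers less familiar with the two helpers, here is a toy user-space
model of the difference this patch relies on. It is only a sketch and not
kernel code: struct toy_pte, toy_tlb_flushes, toy_clear_young(),
toy_clear_flush_young() and toy_scan() are all hypothetical names made up
for illustration. The flushing variant mirrors the shape of the generic
ptep_clear_flush_young() (test and clear the young bit, then flush only
if it was set); notifier handling is left out entirely.

```c
/* Toy user-space model; NOT kernel code. All names are hypothetical. */
#include <stdbool.h>

struct toy_pte {
	bool young;			/* models the hardware access bit */
};

static unsigned long toy_tlb_flushes;	/* counts simulated TLB flushes */

/* Test and clear the young bit without any TLB maintenance. */
static bool toy_clear_young(struct toy_pte *pte)
{
	bool was_young = pte->young;

	pte->young = false;
	return was_young;
}

/* Same as above, but flush whenever the bit was found set. */
static bool toy_clear_flush_young(struct toy_pte *pte)
{
	bool was_young = toy_clear_young(pte);

	if (was_young)
		toy_tlb_flushes++;
	return was_young;
}

/* Walk n PTEs once and return how many were young. */
static unsigned long toy_scan(struct toy_pte *ptes, unsigned long n,
			      bool flush)
{
	unsigned long i, referenced = 0;

	for (i = 0; i < n; i++) {
		bool young = flush ? toy_clear_flush_young(&ptes[i])
				   : toy_clear_young(&ptes[i]);

		if (young)
			referenced++;
	}
	return referenced;
}
```

In this model both scans report the same number of referenced PTEs; only
the flushing scan pays one flush per young PTE. What the model cannot show
is the accuracy cost: on real hardware a stale TLB entry may delay the
access bit being set again after it has been cleared.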

mm/rmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 5bcb334cd6f2..7ce6f0b6c330 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -830,7 +830,7 @@ static bool folio_referenced_one(struct folio *folio,
 		}
 
 		if (pvmw.pte) {
-			if (ptep_clear_flush_young_notify(vma, address,
+			if (ptep_clear_young_notify(vma, address,
 						pvmw.pte)) {
 				/*
 				 * Don't treat a reference through
--
2.25.1