Re: [PATCH] mm: huge_memory: add folio_mark_accessed() when zapping file THP

From: Baolin Wang
Date: Thu Apr 10 2025 - 21:07:31 EST




On 2025/4/10 16:45, Oscar Salvador wrote:
On Wed, Apr 09, 2025 at 05:38:58PM +0800, Baolin Wang wrote:
When investigating performance issues during file folio unmap, I noticed some
behavioral differences in handling non-PMD-sized folios and PMD-sized folios.
For non-PMD-sized file folios, it will call folio_mark_accessed() to mark the
folio as having seen activity, but this is not done for PMD-sized folios.

This might not cause obvious issues, but it could lead to hot file folios
being reclaimed under memory pressure, as quoted from Johannes:

"
Sometimes file contents are only accessed through relatively short-lived
mappings. But they can nevertheless be accessed a lot and be hot. It's
important to not lose that information on unmap, and end up kicking out a
frequently used cache page.
"

Therefore, we should also add folio_mark_accessed() for PMD-sized file
folios when unmapping.

Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Acked-by: Zi Yan <ziy@xxxxxxxxxx>

Reviewed-by: Oscar Salvador <osalvador@xxxxxxx>

Thanks.

Although I agree with David here that pmd_present would be more obvious than
flush_needed.
It was not obvious to me at first glance.

How about adding some comments to make it clear?

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b3ade7ac5bbf..93abd1fcc4fb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2263,6 +2263,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		add_mm_counter(tlb->mm, mm_counter_file(folio),
 			       -HPAGE_PMD_NR);
 
+		/*
+		 * Use flush_needed to indicate whether the PMD entry is present,
+		 * instead of checking pmd_present() again.
+		 */
 		if (flush_needed && pmd_young(orig_pmd) &&
 		    likely(vma_has_recency(vma)))
 			folio_mark_accessed(folio);
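
To illustrate why flush_needed can stand in for pmd_present() in this check, here is a minimal user-space sketch. It is not kernel code: struct model_pmd and the model_* helpers are hypothetical names made up for illustration. It assumes that zap_huge_pmd() starts with flush_needed set to 1 and clears it only when the PMD is a non-present (migration) entry.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a huge PMD entry; not the kernel's pmd_t. */
struct model_pmd {
	bool present;	/* models pmd_present(): entry maps a folio directly */
	bool young;	/* models pmd_young(): the accessed bit is set */
};

/*
 * Models how flush_needed is assumed to be derived in zap_huge_pmd():
 * it defaults to 1 and is cleared only for a non-present entry, so it
 * carries the same information as pmd_present() at the point of use.
 */
static int model_flush_needed(struct model_pmd pmd)
{
	int flush_needed = 1;

	if (!pmd.present)
		flush_needed = 0;	/* non-present entry: no TLB flush */
	return flush_needed;
}

/* Models the condition that guards folio_mark_accessed() in the hunk. */
static bool model_should_mark_accessed(struct model_pmd pmd,
				       bool vma_has_recency)
{
	return model_flush_needed(pmd) && pmd.young && vma_has_recency;
}
```

Under these assumptions, the folio is marked accessed only when the entry is present, its accessed bit is set, and the VMA tracks recency. A non-present entry never reaches pmd_young(), because the && short-circuits on flush_needed.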