Re: [PATCH v2 3/3] mm: rmap: support batched unmapping for file large folios

From: Baolin Wang

Date: Tue Dec 16 2025 - 00:49:59 EST

On 2025/12/15 20:38, Lorenzo Stoakes wrote:
> On Thu, Dec 11, 2025 at 04:16:56PM +0800, Baolin Wang wrote:
> > Similar to folio_referenced_one(), we can apply batched unmapping for file
> > large folios to optimize the performance of file folio reclamation.
> >
> > Performance testing:
> > Allocate 10G of clean file-backed folios via mmap() in a memory cgroup, and
> > try to reclaim 8G of file-backed folios via the memory.reclaim interface. I
> > can observe a 75% performance improvement on my Arm64 32-core server.
>
> Again, you must test on non-arm64 architectures and report the numbers for
> this also.

Yes, I've tested on an x86 machine, and will add the data to the commit message.

W/o patch:
real 0m1.018s
user 0m0.000s
sys 0m1.018s

W/ patch:
real 0m0.249s
user 0m0.000s
sys 0m0.249s
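
For reference, the test works roughly as follows (a minimal C sketch; the file path, cgroup name, and page size are assumptions matching my setup, and error handling is omitted):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#define FILE_SIZE	(10UL << 30)	/* 10G of file-backed memory */

int main(void)
{
	/* Assumes /mnt/testfile already exists and is >= 10G. */
	int fd = open("/mnt/testfile", O_RDONLY);
	char *p = mmap(NULL, FILE_SIZE, PROT_READ, MAP_SHARED, fd, 0);
	unsigned long i;
	volatile char c;

	/* Fault in every page so the page cache holds clean folios. */
	for (i = 0; i < FILE_SIZE; i += 4096)
		c = p[i];

	/*
	 * The timed step: ask the kernel to reclaim 8G from this task's
	 * memory cgroup (assumes cgroup v2 mounted at /sys/fs/cgroup and
	 * a cgroup named "test" that this task is attached to).
	 */
	int rfd = open("/sys/fs/cgroup/test/memory.reclaim", O_WRONLY);
	write(rfd, "8G", 2);

	close(rfd);
	munmap(p, FILE_SIZE);
	close(fd);
	return 0;
}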

> > Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
> > ---
> >  mm/rmap.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index ec232165c47d..4c9d5777c8da 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1855,9 +1855,10 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
> >  	end_addr = pmd_addr_end(addr, vma->vm_end);
> >  	max_nr = (end_addr - addr) >> PAGE_SHIFT;
> >
> > -	/* We only support lazyfree batching for now ... */
> > -	if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
> > +	/* We only support lazyfree or file folios batching for now ... */
> > +	if (folio_test_anon(folio) && folio_test_swapbacked(folio))

> Why is it now ok to support file-backed batched unmapping when it wasn't in
> Barry's series (see [0])? You don't seem to be justifying this?

Barry's series [0] was aimed only at optimizing lazyfree anonymous large folios; it did not go on to cover regular anonymous large folios or file-backed large folios at that point.

Subsequently, Barry sent out a new patch (see [1]) to optimize anonymous large folios. As for file-backed large folios, batched unmapping support is relatively simple, since we only need to clear the PTE entries. A rough sketch follows the links below.

[0]: https://lore.kernel.org/all/20250214093015.51024-4-21cnbao@xxxxxxxxx/T/#u
[1]: https://lore.kernel.org/all/20250513084620.58231-1-21cnbao@xxxxxxxxx/
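
To put the answer in code form, something like this is the essence of the file-folio batch path (a rough sketch with a hypothetical helper name, not the literal mm/rmap.c code; TLB flushing and the remaining rmap bookkeeping are omitted):

/*
 * Once folio_unmap_pte_batch() reports that 'nr' contiguous PTEs map
 * the same file-backed folio, try_to_unmap_one() can drop all of them
 * at once instead of looping per PTE.
 */
static void unmap_file_folio_batch(struct folio *folio,
				   struct vm_area_struct *vma,
				   unsigned long addr, pte_t *pte,
				   unsigned int nr)
{
	struct mm_struct *mm = vma->vm_mm;

	flush_cache_range(vma, addr, addr + nr * PAGE_SIZE);
	/*
	 * Clear 'nr' PTEs in one call. Clean file folios need no swap
	 * entries, so clearing the PTEs is the only per-PTE work.
	 */
	get_and_clear_full_ptes(mm, addr, pte, nr, 0);
	/* Account for all 'nr' pages with a single counter update. */
	add_mm_counter(mm, mm_counter_file(folio), -nr);
}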

> >  		return 1;
> > +
> >  	if (pte_unused(pte))
> >  		return 1;

> > @@ -2223,7 +2224,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> >  			 *
> >  			 * See Documentation/mm/mmu_notifier.rst
> >  			 */
> > -			dec_mm_counter(mm, mm_counter_file(folio));
> > +			add_mm_counter(mm, mm_counter_file(folio), -nr_pages);

> Was this just a bug before?

Nope. Before this patch, we never supported batched unmapping for file-backed large folios, so 'nr_pages' was always 1 and the two calls were equivalent. After this patch, 'nr_pages' is the number of PTEs batched for the file-backed large folio, so we must subtract the full count.
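
For reference, dec_mm_counter() is (as of current kernels) defined in include/linux/mm.h as the single-page case of add_mm_counter():

static inline void dec_mm_counter(struct mm_struct *mm, int member)
{
	add_mm_counter(mm, member, -1);
}

So add_mm_counter(mm, mm_counter_file(folio), -nr_pages) with nr_pages == 1 is exactly the old behavior, and the change only matters once a batch spans more than one PTE.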