Re: [PATCH 2/2] ksm: Optimize rmap_walk_ksm by passing a suitable address range

From: David Hildenbrand (Red Hat)

Date: Wed Jan 14 2026 - 05:25:00 EST


On 1/14/26 03:40, xu.xin16@xxxxxxxxxx wrote:
Solution
========
In fact, we can significantly improve performance by passing a more precise
range based on the given addr. Since the original pages merged by KSM
correspond to anonymous VMAs, the page offset can be calculated as
pgoff = address >> PAGE_SHIFT. Therefore, we can optimize the call by
defining:

pgoff_start = rmap_item->address >> PAGE_SHIFT;
pgoff_end = pgoff_start + folio_nr_pages(folio) - 1;
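
The range computation above can be sketched in user space as follows. This is only an illustration under stated assumptions: 4 KiB pages (SKETCH_PAGE_SHIFT of 12 stands in for the kernel's architecture-dependent PAGE_SHIFT), and the struct and function names are hypothetical rather than taken from mm/ksm.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SHIFT depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12u

/* Hypothetical container for the inclusive page-offset range. */
struct sketch_pgoff_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Derive the inclusive page-offset range covered by nr_pages pages starting
 * at rmap_address. Bits below the page shift (where rmap_item->address keeps
 * its flags) are discarded by the shift, so they do not need masking first.
 */
static struct sketch_pgoff_range sketch_pgoff_range(uint64_t rmap_address,
						    uint64_t nr_pages)
{
	struct sketch_pgoff_range r;

	assert(nr_pages >= 1);
	r.start = rmap_address >> SKETCH_PAGE_SHIFT;
	r.end = r.start + nr_pages - 1;
	return r;
}
```

For a single page, start and end come out equal, so the interval-tree lookup would be restricted to exactly one page offset.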

Performance
===========
In our real embedded Linux environment, the measured metrics were as follows:

1) Time_ms: Maximum time the anon_vma lock is held in a single rmap_walk_ksm call.
2) Nr_iteration_total: Maximum number of iterations in one anon_vma_interval_tree_foreach loop.
3) Skip_addr_out_of_range: Maximum number of iterations skipped by the first check
(vma->vm_start and vma->vm_end) in one anon_vma_interval_tree_foreach loop.
4) Skip_mm_mismatch: Maximum number of iterations skipped by the second check
(rmap_item->mm == vma->vm_mm) in one anon_vma_interval_tree_foreach loop.

The result is as follows:

                Time_ms  Nr_iteration_total  Skip_addr_out_of_range  Skip_mm_mismatch
Before patch:   228.65   22169               22168                   0
After patch:    0.396    3                   0                       2
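
A toy model can show why a narrower range shrinks the first two counters. It is a sketch, not the kernel code: a linear scan with an overlap test stands in for the interval tree, each VMA is assumed to have a page offset equal to vm_start >> PAGE_SHIFT, the mm check is reduced to a plain inequality, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12u

/* Hypothetical stand-in for the VMA fields the two checks look at. */
struct sketch_vma {
	uint64_t vm_start;	/* inclusive */
	uint64_t vm_end;	/* exclusive */
	int mm_id;		/* stand-in for vma->vm_mm */
};

struct sketch_stats {
	unsigned long nr_iteration_total;
	unsigned long skip_addr_out_of_range;
	unsigned long skip_mm_mismatch;
	unsigned long nr_matched;
};

/*
 * Visit every VMA whose page-offset interval overlaps [first, last], then
 * apply the address check and the mm check in that order, counting each
 * outcome.
 */
static struct sketch_stats sketch_walk(const struct sketch_vma *vmas, size_t n,
				       uint64_t addr, int mm_id,
				       uint64_t first, uint64_t last)
{
	struct sketch_stats s = { 0, 0, 0, 0 };

	assert(first <= last);
	for (size_t i = 0; i < n; i++) {
		uint64_t lo = vmas[i].vm_start >> SKETCH_PAGE_SHIFT;
		uint64_t hi = (vmas[i].vm_end - 1) >> SKETCH_PAGE_SHIFT;

		/* An interval tree would not return non-overlapping entries. */
		if (hi < first || lo > last)
			continue;
		s.nr_iteration_total++;

		/* First check: addr must lie inside the VMA. */
		if (addr < vmas[i].vm_start || addr >= vmas[i].vm_end) {
			s.skip_addr_out_of_range++;
			continue;
		}
		/* Second check (simplified): the VMA must belong to the same mm. */
		if (vmas[i].mm_id != mm_id) {
			s.skip_mm_mismatch++;
			continue;
		}
		s.nr_matched++;
	}
	return s;
}
```

Querying with the full range 0..UINT64_MAX visits every VMA and leaves the address check to reject most of them. Querying with the single page offset of addr visits only the VMAs that can contain it, so only the mm check remains to filter.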

Nice improvement.

Can you make your reproducer available?

I'll do my best to try it. The original test data was derived from real business scenarios,
but it's quite complex. I'll try to simplify this high-latency scenario into a more
understandable demo as a reproduction program.

Ah, I thought it was some benchmark run on an embedded environment.

How did you end up measuring these numbers?




Co-developed-by: Wang Yaxin <wang.yaxin@xxxxxxxxxx>
Signed-off-by: xu xin <xu.xin16@xxxxxxxxxx>
---
mm/ksm.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 335e7151e4a1..0a074ad8e867 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -3172,6 +3172,7 @@ void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc)
struct anon_vma_chain *vmac;
struct vm_area_struct *vma;
unsigned long addr;
+ pgoff_t pgoff_start, pgoff_end;

cond_resched();
if (!anon_vma_trylock_read(anon_vma)) {
@@ -3185,8 +3186,11 @@ void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc)
/* Ignore the stable/unstable/sqnr flags */
addr = rmap_item->address & PAGE_MASK;

+ pgoff_start = rmap_item->address >> PAGE_SHIFT;
+ pgoff_end = pgoff_start + folio_nr_pages(folio) - 1;

KSM folios are always order-0, so you can keep it simple and hard-code
PAGE_SIZE here.

You can also initialize both values directly and make them const.
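
One possible reading of these two suggestions, sketched in user space: with a single-page (order-0) folio the range collapses to one page offset, and both values can be initialized at declaration and marked const. The types, the SKETCH_PAGE_SHIFT value of 12, and the function name are assumptions for illustration and are not the v2 patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12u

typedef uint64_t sketch_pgoff_t;	/* stand-in for the kernel's pgoff_t */

/* Hypothetical stand-in carrying only the field used here. */
struct sketch_rmap_item {
	uint64_t address;
};

/*
 * Compute the lookup range for a single-page folio. Both bounds are
 * initialized where they are declared and never modified afterwards.
 */
static void sketch_single_page_range(const struct sketch_rmap_item *rmap_item,
				     sketch_pgoff_t *first,
				     sketch_pgoff_t *last)
{
	const sketch_pgoff_t pgoff_start = rmap_item->address >> SKETCH_PAGE_SHIFT;
	const sketch_pgoff_t pgoff_end = pgoff_start;	/* exactly one page */

	assert(first && last);
	*first = pgoff_start;
	*last = pgoff_end;
}
```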

Yes, I'll do it in v2.

To me, this looks reasonable, but getting some eyes from people more familiar with KSM+interval-tree handling would be great. (CCing also Rik)

--
Cheers

David