In preparation for switching the vmstat shepherd to flush
per-CPU counters remotely, switch all functions that
modify the counters to use cmpxchg.
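
As an illustration of why a compare-and-exchange update matters here, the following is a minimal userspace sketch using C11 atomics, not the kernel code from this patch. The names THRESHOLD, cpu_delta, global_count, mod_counter(), remote_flush() and demo() are all hypothetical, and a single variable stands in for one CPU's slot. The local update retries if the pending delta changed underneath it, so a remote flusher that takes the delta concurrently cannot cause an update to be lost.

```c
#include <stdatomic.h>
#include <stdlib.h>

#define THRESHOLD 32			/* hypothetical fold threshold */

static _Atomic long global_count;	/* shared, zone-wide style counter */
static _Atomic int cpu_delta;		/* stand-in for one CPU's pending delta */

/* Local update: add delta, folding into the global counter past THRESHOLD. */
static void mod_counter(int delta)
{
	int old, next, fold;

	old = atomic_load(&cpu_delta);
	do {
		fold = 0;
		next = old + delta;
		if (abs(next) > THRESHOLD) {
			fold = next;	/* move the whole pending value out */
			next = 0;
		}
		/* On failure, old is reloaded with the current value; retry. */
	} while (!atomic_compare_exchange_weak(&cpu_delta, &old, next));

	if (fold)
		atomic_fetch_add(&global_count, fold);
}

/* Remote flush: atomically take whatever is pending and fold it in. */
static void remote_flush(void)
{
	int pending = atomic_exchange(&cpu_delta, 0);

	if (pending)
		atomic_fetch_add(&global_count, pending);
}

/* Usage: 100 increments followed by one flush; returns the global total. */
static long demo(void)
{
	for (int i = 0; i < 100; i++)
		mod_counter(1);
	remote_flush();
	return atomic_load(&global_count);
}
```

In this sketch, every increment ends up in global_count exactly once, whether it was folded by the local path or collected by remote_flush(). A plain read-modify-write on cpu_delta would not give that guarantee once another CPU can zero the delta.
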
To measure the performance difference, the page allocator microbenchmark
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench01.c
was run with loops=1000000 on an Intel Core i7-11850H @ 2.50GHz.
For the single_page_alloc_free test, which runs this loop:

	/** Loop to measure **/
	for (i = 0; i < rec->loops; i++) {
		my_page = alloc_page(gfp_mask);
		if (unlikely(my_page == NULL))
			return 0;
		__free_page(my_page);
	}

the results, in cycles, are:

	Vanilla    Patched    Diff
	159        165        3.7%

Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
Index: linux-vmstat-remote/mm/vmstat.c
===================================================================
--- linux-vmstat-remote.orig/mm/vmstat.c
+++ linux-vmstat-remote/mm/vmstat.c
@@ -334,6 +334,188 @@ void set_pgdat_percpu_threshold(pg_data_
}
}