[linus:master] [mm] 9890ecab6a: vm-scalability.throughput 3.8% regression

From: kernel test robot

Date: Tue Mar 10 2026 - 02:39:53 EST




Hello,

kernel test robot noticed a 3.8% regression of vm-scalability.throughput on:


commit: 9890ecab6ad9c0d3d342469f3b619fd704b5c59a ("mm: folio_zero_user: clear pages sequentially")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on linus/master 0031c06807cfa8aa51a759ff8aa09e1aa48149af]
[still regression on linux-next/master c025f6cf4209e1542ec2afebe49f42bbaf1a5c7b]

testcase: vm-scalability
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:

runtime: 300s
size: 8T
test: anon-w-seq-mt
cpufreq_governor: performance



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202603101342.297fb270-lkp@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260310/202603101342.297fb270-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
gcc-14/performance/x86_64-rhel-9.4/debian-13-x86_64-20250902.cgz/300s/8T/lkp-cpl-4sp2/anon-w-seq-mt/vm-scalability

commit:
cb431accb3 ("x86/clear_page: introduce clear_pages()")
9890ecab6a ("mm: folio_zero_user: clear pages sequentially")

cb431accb36e51b6 9890ecab6ad9c0d3d342469f3b6
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.08 ± 3% +8.7% 0.09 ± 3% vm-scalability.free_time
357969 -6.6% 334511 vm-scalability.median
1.034e+08 -3.8% 99382138 vm-scalability.throughput
634243 -13.6% 548120 ± 6% vm-scalability.time.involuntary_context_switches
12706518 -6.6% 11872543 vm-scalability.time.minor_page_faults
15142 -4.2% 14512 vm-scalability.time.system_time
16939 +6.1% 17975 vm-scalability.time.user_time
251227 -6.8% 234071 vm-scalability.time.voluntary_context_switches
1.791e+10 -6.6% 1.674e+10 vm-scalability.workload
0.30 -7.5% 0.28 turbostat.IPC
9203 -5.5% 8693 vmstat.system.cs
0.08 +0.0 0.08 mpstat.cpu.all.soft%
25.14 +1.5 26.62 mpstat.cpu.all.usr%
3.13 +18.3% 3.71 perf-stat.i.MPKI
6.22e+10 -6.6% 5.81e+10 perf-stat.i.branch-instructions
61.69 +9.8 71.51 perf-stat.i.cache-miss-rate%
6.147e+08 +10.7% 6.805e+08 perf-stat.i.cache-misses
9.904e+08 -4.7% 9.436e+08 perf-stat.i.cache-references
9303 -5.2% 8823 perf-stat.i.context-switches
2.17 +8.2% 2.35 perf-stat.i.cpi
598.97 -4.6% 571.28 perf-stat.i.cpu-migrations
1.95e+11 -6.6% 1.822e+11 perf-stat.i.instructions
0.47 -7.2% 0.43 perf-stat.i.ipc
43153 -6.5% 40334 perf-stat.i.minor-faults
43153 -6.5% 40335 perf-stat.i.page-faults
3.16 +18.5% 3.74 perf-stat.overall.MPKI
0.02 +0.0 0.03 perf-stat.overall.branch-miss-rate%
62.11 +10.1 72.19 perf-stat.overall.cache-miss-rate%
2.19 +8.3% 2.37 perf-stat.overall.cpi
692.89 -8.6% 633.07 perf-stat.overall.cycles-between-cache-misses
0.46 -7.6% 0.42 perf-stat.overall.ipc
6.121e+10 -6.8% 5.705e+10 perf-stat.ps.branch-instructions
6.054e+08 +10.5% 6.689e+08 perf-stat.ps.cache-misses
9.747e+08 -4.9% 9.266e+08 perf-stat.ps.cache-references
9124 -5.6% 8613 perf-stat.ps.context-switches
583.66 -4.9% 555.21 perf-stat.ps.cpu-migrations
1.919e+11 -6.8% 1.789e+11 perf-stat.ps.instructions
42389 -6.7% 39549 perf-stat.ps.minor-faults
42389 -6.7% 39549 perf-stat.ps.page-faults
5.812e+13 -6.5% 5.434e+13 perf-stat.total.instructions
40.26 -40.3 0.00 perf-profile.calltrace.cycles-pp.clear_subpage.folio_zero_user.vma_alloc_anon_folio_pmd.__do_huge_pmd_anonymous_page.__handle_mm_fault
40.76 -2.1 38.66 perf-profile.calltrace.cycles-pp.folio_zero_user.vma_alloc_anon_folio_pmd.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
42.59 -2.0 40.61 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access
42.54 -2.0 40.57 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
42.54 -2.0 40.57 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access
42.40 -2.0 40.43 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
42.32 -2.0 40.36 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
42.23 -2.0 40.27 perf-profile.calltrace.cycles-pp.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
41.70 -2.0 39.74 perf-profile.calltrace.cycles-pp.vma_alloc_anon_folio_pmd.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.76 -0.0 0.72 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.vma_alloc_anon_folio_pmd.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
0.72 -0.0 0.68 perf-profile.calltrace.cycles-pp.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof.vma_alloc_anon_folio_pmd.__do_huge_pmd_anonymous_page
0.67 -0.0 0.64 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof.vma_alloc_anon_folio_pmd
0.72 -0.0 0.69 perf-profile.calltrace.cycles-pp.alloc_pages_mpol.vma_alloc_folio_noprof.vma_alloc_anon_folio_pmd.__do_huge_pmd_anonymous_page.__handle_mm_fault
0.56 -0.0 0.54 perf-profile.calltrace.cycles-pp.prep_new_page.get_page_from_freelist.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof
0.00 +0.8 0.76 ± 2% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.folio_zero_user.vma_alloc_anon_folio_pmd.__do_huge_pmd_anonymous_page.__handle_mm_fault
30.25 +1.2 31.46 perf-profile.calltrace.cycles-pp.do_rw_once
40.49 -40.5 0.00 perf-profile.children.cycles-pp.clear_subpage
42.61 -2.0 40.63 perf-profile.children.cycles-pp.asm_exc_page_fault
42.55 -2.0 40.58 perf-profile.children.cycles-pp.exc_page_fault
42.54 -2.0 40.57 perf-profile.children.cycles-pp.do_user_addr_fault
42.40 -2.0 40.43 perf-profile.children.cycles-pp.handle_mm_fault
42.33 -2.0 40.36 perf-profile.children.cycles-pp.__handle_mm_fault
42.23 -2.0 40.27 perf-profile.children.cycles-pp.__do_huge_pmd_anonymous_page
41.70 -2.0 39.74 perf-profile.children.cycles-pp.vma_alloc_anon_folio_pmd
40.83 -1.9 38.92 perf-profile.children.cycles-pp.folio_zero_user
63.93 -1.2 62.77 perf-profile.children.cycles-pp.do_access
0.95 -0.0 0.91 perf-profile.children.cycles-pp.__alloc_frozen_pages_noprof
0.78 -0.0 0.74 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
0.95 -0.0 0.92 perf-profile.children.cycles-pp.alloc_pages_mpol
0.79 -0.0 0.76 perf-profile.children.cycles-pp.get_page_from_freelist
0.63 -0.0 0.60 perf-profile.children.cycles-pp.prep_new_page
40.31 +2.5 42.80 perf-profile.children.cycles-pp.do_rw_once
39.77 -39.8 0.00 perf-profile.self.cycles-pp.clear_subpage
9.54 -0.3 9.23 perf-profile.self.cycles-pp.do_access
0.55 -0.0 0.53 perf-profile.self.cycles-pp.prep_new_page
38.35 +2.6 40.96 perf-profile.self.cycles-pp.do_rw_once
0.36 ± 2% +38.0 38.32 perf-profile.self.cycles-pp.folio_zero_user




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki