[tip:locking/core] [futex] cb8c4312af: will-it-scale.per_process_ops -3.2% regression

From: kernel test robot
Date: Sun Oct 08 2023 - 03:08:32 EST




Hello,

kernel test robot noticed a -3.2% regression of will-it-scale.per_process_ops on:


commit: cb8c4312afca1b2dc64107e7e7cea81911055612 ("futex: Add sys_futex_wait()")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git locking/core

testcase: will-it-scale
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:

nr_task: 16
mode: process
test: futex4
cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202310081429.a30c99f2-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231008/202310081429.a30c99f2-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/process/16/debian-11.1-x86_64-20220510.cgz/lkp-cpl-4sp2/futex4/will-it-scale

commit:
43adf84495 ("futex: FLAGS_STRICT")
cb8c4312af ("futex: Add sys_futex_wait()")

43adf844951084c2 cb8c4312afca1b2dc64107e7e7c
---------------- ---------------------------
         %stddev  %change           %stddev
            \        |                 \
 1.339e+08          -3.2%   1.296e+08        will-it-scale.16.processes
   8367312          -3.2%     8102637        will-it-scale.per_process_ops
 1.339e+08          -3.2%   1.296e+08        will-it-scale.workload
      0.61          -0.0         0.59        perf-stat.i.branch-miss-rate%
  72599095          -2.7%    70647352        perf-stat.i.branch-misses
      0.80          -1.8%        0.79        perf-stat.i.cpi
 2.073e+10          +3.8%   2.152e+10        perf-stat.i.dTLB-loads
  1.72e+10          +2.2%   1.757e+10        perf-stat.i.dTLB-stores
  66739031          -5.4%    63102078        perf-stat.i.iTLB-load-misses
   2080892          +2.4%     2131032        perf-stat.i.iTLB-loads
 8.203e+10          +1.6%   8.337e+10        perf-stat.i.instructions
      1231          +7.3%        1321        perf-stat.i.instructions-per-iTLB-miss
      1.24          +1.8%        1.27        perf-stat.i.ipc
    222.58          +2.4%      227.82        perf-stat.i.metric.M/sec
      0.61          -0.0         0.59        perf-stat.overall.branch-miss-rate%
      0.80          -1.8%        0.79        perf-stat.overall.cpi
      1229          +7.5%        1321        perf-stat.overall.instructions-per-iTLB-miss
      1.24          +1.8%        1.27        perf-stat.overall.ipc
    184025          +4.9%      193123        perf-stat.overall.path-length
  72373935          -2.7%    70427711        perf-stat.ps.branch-misses
 2.066e+10          +3.8%   2.144e+10        perf-stat.ps.dTLB-loads
 1.714e+10          +2.2%   1.751e+10        perf-stat.ps.dTLB-stores
  66517376          -5.5%    62888454        perf-stat.ps.iTLB-load-misses
   2073911          +2.4%     2123876        perf-stat.ps.iTLB-loads
 8.175e+10          +1.6%   8.309e+10        perf-stat.ps.instructions
 2.464e+13          +1.6%   2.504e+13        perf-stat.total.instructions
     29.29 ± 2%    -29.3         0.00        perf-profile.calltrace.cycles-pp.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     12.17 ± 2%    -12.2         0.00        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
      9.21 ± 2%     -9.2         0.00        perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
      6.61 ± 2%     -6.6         0.00        perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.futex_wait.do_futex
      2.03 ± 2%     -0.1         1.88 ± 3%   perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      0.00          +2.0         1.98 ± 4%   perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00          +4.0         3.96 ± 3%   perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00          +4.1         4.09 ± 3%   perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
      0.00          +4.4         4.35 ± 3%   perf-profile.calltrace.cycles-pp.futex_hash.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
      0.00          +6.1         6.14 ± 3%   perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait
      0.00          +8.5         8.52 ± 3%   perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00         +11.3        11.27 ± 3%   perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00         +27.4        27.44 ± 3%   perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      0.00         +31.3        31.33 ± 3%   perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     29.80 ± 2%     -1.9        27.91 ± 3%   perf-profile.children.cycles-pp.futex_wait_setup
     12.68 ± 2%     -0.9        11.74 ± 3%   perf-profile.children.cycles-pp.futex_q_lock
      7.49 ± 2%     -0.6         6.93 ± 3%   perf-profile.children.cycles-pp.__get_user_nocheck_4
      4.38 ± 2%     -0.4         3.96 ± 3%   perf-profile.children.cycles-pp.futex_q_unlock
      4.74 ± 2%     -0.4         4.35 ± 3%   perf-profile.children.cycles-pp.futex_hash
      4.62 ± 2%     -0.3         4.33 ± 3%   perf-profile.children.cycles-pp._raw_spin_lock
      0.48 ± 3%     -0.2         0.32 ± 5%   perf-profile.children.cycles-pp.futex_setup_timer
      1.71 ± 2%     -0.1         1.57 ± 4%   perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.24 ± 3%     -0.1         1.14 ± 4%   perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.52 ± 5%     -0.0         0.47 ± 4%   perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.35 ± 3%     -0.0         0.31 ± 3%   perf-profile.children.cycles-pp.syscall@plt
      0.00         +31.5        31.46 ± 3%   perf-profile.children.cycles-pp.__futex_wait
      7.88 ± 2%     -2.4         5.48 ± 2%   perf-profile.self.cycles-pp.futex_wait
     10.37 ± 3%     -0.9         9.46 ± 3%   perf-profile.self.cycles-pp.syscall
      7.46 ± 2%     -0.6         6.91 ± 3%   perf-profile.self.cycles-pp.__get_user_nocheck_4
      4.20 ± 2%     -0.4         3.78 ± 3%   perf-profile.self.cycles-pp.futex_q_unlock
      4.56 ± 2%     -0.4         4.19 ± 3%   perf-profile.self.cycles-pp.futex_hash
      4.44 ± 2%     -0.3         4.16 ± 3%   perf-profile.self.cycles-pp._raw_spin_lock
      3.54 ± 2%     -0.2         3.29 ± 3%   perf-profile.self.cycles-pp.futex_q_lock
      1.71 ± 2%     -0.1         1.57 ± 4%   perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.40 ± 3%     -0.1         0.32 ± 5%   perf-profile.self.cycles-pp.futex_setup_timer
      1.18          -0.1         1.10 ± 3%   perf-profile.self.cycles-pp.do_syscall_64
      1.00          -0.1         0.94 ± 4%   perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      2.14 ± 3%     +0.2         2.31 ± 3%   perf-profile.self.cycles-pp.__x64_sys_futex
      0.00          +3.5         3.50 ± 3%   perf-profile.self.cycles-pp.__futex_wait
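
The headline figures in the comparison can be cross-checked from the raw values
quoted in it. The sketch below is not part of the LKP tooling; the helper name
pct_change is made up for illustration, and the path-length formula (total
instructions divided by workload operations) is an assumption that happens to
be consistent with the reported numbers.

```python
# Re-derive a few reported figures from the raw values in the comparison.
# Illustrative only: pct_change is a hypothetical helper, not an LKP function.

def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0

NR_TASK = 16  # from the "nr_task: 16" test parameter

# will-it-scale.per_process_ops for 43adf84495 (base) and cb8c4312af (new)
base_ops, new_ops = 8367312, 8102637

print(f"per_process_ops change: {pct_change(base_ops, new_ops):+.1f}%")
print(f"total ops, base: {base_ops * NR_TASK:.3e}")
print(f"total ops, new:  {new_ops * NR_TASK:.3e}")

# Assumption: path-length = perf-stat.total.instructions / will-it-scale.workload.
# The inputs are already rounded in the report, so results are approximate.
base_path = 2.464e13 / 1.339e8
new_path = 2.504e13 / 1.296e8
print(f"estimated path-length, base: {base_path:.0f}")
print(f"estimated path-length, new:  {new_path:.0f}")
print(f"estimated path-length change: {pct_change(base_path, new_path):+.1f}%")
```

Because the instruction and workload totals are rounded to four significant
digits in the report, the estimated path-length values only approximate the
reported 184025 and 193123.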




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki