[linus:master] [x86/stackprotector] f3856cd343: will-it-scale.per_thread_ops 1.9% regression
From: kernel test robot
Date: Mon Apr 07 2025 - 22:53:54 EST
Hello,
kernel test robot noticed a 1.9% regression of will-it-scale.per_thread_ops on:
commit: f3856cd343b6371530c9af3c97354cdc003f3203 ("x86/stackprotector: Move __stack_chk_guard to percpu hot section")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 384 threads 2 sockets Intel(R) Xeon(R) 6972P (Granite Rapids) with 128G memory
parameters:
nr_task: 100%
mode: thread
test: futex4
cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202504081055.4c396e03-lkp@xxxxxxxxx
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250408/202504081055.4c396e03-lkp@xxxxxxxxx
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-gnr-2ap2/futex4/will-it-scale
commit:
a1e4cc0155 ("x86/percpu: Move current_task to percpu hot section")
f3856cd343 ("x86/stackprotector: Move __stack_chk_guard to percpu hot section")
a1e4cc0155ad577a f3856cd343b6371530c9af3c973
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      0.84            +2.1%       0.86        perf-stat.overall.cpi
      1.19            -2.0%       1.17        perf-stat.overall.ipc
 4.785e+14            -1.9%  4.692e+14        perf-stat.total.instructions
 2.484e+09            -1.9%  2.435e+09        will-it-scale.384.threads
   6467627            -1.9%    6342224        will-it-scale.per_thread_ops
 2.484e+09            -1.9%  2.435e+09        will-it-scale.workload
     40.20             -1.3      38.94        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     35.54             -1.3      34.29        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     29.67             -1.2      28.44        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
     42.56             -1.1      41.49        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     46.26             -1.0      45.31        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     31.33             -0.8      30.52        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
     65.81             -0.7      65.08        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     62.78             -0.7      62.12        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      4.94             -0.3       4.67        perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      2.62             -0.2       2.42        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
      3.62             -0.1       3.49        perf-profile.calltrace.cycles-pp.futex_hash.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
      1.69             -0.1       1.56        perf-profile.calltrace.cycles-pp.testcase
      1.06             -0.0       1.04        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.91             +0.1       1.99        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      2.21             +0.1       2.34        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     90.29             +0.2      90.49        perf-profile.calltrace.cycles-pp.syscall
      1.13             +2.6       3.68        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
     30.76             -1.3      29.47        perf-profile.children.cycles-pp.futex_wait_setup
     36.03             -1.3      34.77        perf-profile.children.cycles-pp.__futex_wait
     40.41             -1.3      39.16        perf-profile.children.cycles-pp.futex_wait
     43.41             -1.0      42.37        perf-profile.children.cycles-pp.do_futex
     46.83             -0.9      45.91        perf-profile.children.cycles-pp.__x64_sys_futex
     66.21             -0.7      65.55        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     63.79             -0.7      63.13        perf-profile.children.cycles-pp.do_syscall_64
      7.52             -0.3       7.20        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      5.16             -0.3       4.90        perf-profile.children.cycles-pp.futex_q_unlock
      2.79             -0.2       2.58        perf-profile.children.cycles-pp.get_futex_key
      1.99             -0.1       1.84        perf-profile.children.cycles-pp.testcase
      3.78             -0.1       3.64        perf-profile.children.cycles-pp.futex_hash
      0.56             -0.0       0.51        perf-profile.children.cycles-pp.syscall@plt
      2.18             +0.1       2.25        perf-profile.children.cycles-pp.syscall_return_via_sysret
      2.49             +0.1       2.62        perf-profile.children.cycles-pp.x64_sys_call
     17.32             +0.6      17.89        perf-profile.children.cycles-pp.entry_SYSCALL_64
      1.16             +1.5       2.70        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
     13.43             -0.7      12.74        perf-profile.self.cycles-pp.futex_wait_setup
     17.52             -0.4      17.13        perf-profile.self.cycles-pp.syscall
      7.25             -0.3       6.94        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      4.87 ± 2%        -0.3       4.62        perf-profile.self.cycles-pp.futex_q_unlock
      2.51             -0.2       2.32        perf-profile.self.cycles-pp.get_futex_key
      3.55             -0.1       3.42        perf-profile.self.cycles-pp.futex_hash
      1.76             -0.1       1.64        perf-profile.self.cycles-pp.testcase
      0.25             -0.0       0.23        perf-profile.self.cycles-pp.syscall@plt
      2.61             +0.0       2.66        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      2.18             +0.1       2.25        perf-profile.self.cycles-pp.syscall_return_via_sysret
      3.42             +0.1       3.53        perf-profile.self.cycles-pp.__x64_sys_futex
      2.81             +0.1       2.93        perf-profile.self.cycles-pp._raw_spin_lock
      2.20             +0.1       2.33        perf-profile.self.cycles-pp.x64_sys_call
      4.62             +0.2       4.77        perf-profile.self.cycles-pp.do_syscall_64
      3.27             +0.2       3.49        perf-profile.self.cycles-pp.do_futex
      1.14             +0.5       1.68        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      3.30             +0.9       4.24        perf-profile.self.cycles-pp.entry_SYSCALL_64
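The %change column can be cross-checked against the two value columns. The
sketch below is a minimal Python example, not part of the LKP tooling, and it
rests on two assumptions inferred from the numbers above: rows that carry a "%"
sign show a relative change, while perf-profile rows show an absolute
difference in percentage points. It also assumes will-it-scale.workload is
per_thread_ops multiplied by the 384 threads, which the reported values are
consistent with. The function names are illustrative only.

```python
# Values copied from the comparison table in this report.
NR_THREADS = 384

base_ops = 6467627  # will-it-scale.per_thread_ops on a1e4cc0155
new_ops = 6342224   # will-it-scale.per_thread_ops on f3856cd343


def relative_change_pct(base: float, new: float) -> float:
    """Relative change in percent (assumed meaning of rows showing a '%')."""
    return (new - base) / base * 100.0


def absolute_delta(base: float, new: float) -> float:
    """Difference in percentage points (assumed meaning of perf-profile rows)."""
    return new - base


print(f"per_thread_ops change: {relative_change_pct(base_ops, new_ops):+.1f}%")
print(f"workload (base): {base_ops * NR_THREADS:.3e}")
print(f"workload (new):  {new_ops * NR_THREADS:.3e}")
print(f"futex_wait calltrace delta:  {absolute_delta(40.20, 38.94):+.1f}")
print(f"entry_SYSCALL_64 self delta: {absolute_delta(3.30, 4.24):+.1f}")
```

Rows such as perf-stat.overall.cpi will not reproduce exactly from the printed
values (0.84 to 0.86 gives about +2.4%, not +2.1%), presumably because the
report computes the change from unrounded measurements.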
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki