[linus:master] [kasan] ae193dd793: stress-ng.access.access_calls_per_sec 5.8% regression
From: kernel test robot
Date: Fri Jan 31 2025 - 04:12:10 EST
Hello,
kernel test robot noticed a 5.8% regression of stress-ng.access.access_calls_per_sec on:
commit: ae193dd79398970ee760e0c8129ac42ef8f5c6ff ("kasan: move checks to do_strncpy_from_user")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed on linus/master 72deda0abee6e705ae71a93f69f55e33be5bca5c]
[test failed on linux-next/master a13f6e0f405ed0d3bcfd37c692c7d7fa3c052154]
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
nr_threads: 100%
disk: 1HDD
testtime: 60s
fs: btrfs
test: access
cpufreq_governor: performance
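For a quick local sanity check without the full LKP setup, the sketch below times a tight loop of os.access() calls on a temporary file. It is only an illustration and not the stress-ng access stressor: the function name access_calls_per_sec, the 0.2 s duration, and the temporary-file target are assumptions made here. Python interpreter overhead will likely mask a change of a few percent in kernel path-name copying, so the reproduction materials linked in this report remain the reference.

```python
import os
import tempfile
import time


def access_calls_per_sec(path: str, duration_s: float = 0.2) -> float:
    """Call os.access() on `path` in a loop and return calls per second.

    Hypothetical helper for illustration; each call passes a user-space
    path string into the kernel, which is the path the report profiles.
    """
    calls = 0
    start = time.perf_counter()
    deadline = start + duration_s
    while time.perf_counter() < deadline:
        os.access(path, os.R_OK)
        calls += 1
    elapsed = time.perf_counter() - start
    return calls / elapsed


# Use a throwaway file so the check always targets an existing path.
with tempfile.NamedTemporaryFile() as tmp:
    rate = access_calls_per_sec(tmp.name)

print(f"os.access calls/sec: {rate:.0f}")
```

Running this on kernels built at the two commits listed below would give only a rough indication; the stress-ng run with the parameters above is what the report measures.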
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202501311634.4f80fa4f-lkp@xxxxxxxxx
Details are below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250131/202501311634.4f80fa4f-lkp@xxxxxxxxx
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/1HDD/btrfs/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/access/stress-ng/60s
commit:
aaf2914aec ("mm: add per-order mTHP swpin counters")
ae193dd793 ("kasan: move checks to do_strncpy_from_user")
aaf2914aec0fa673  ae193dd79398970ee760e0c8129
----------------  ---------------------------
         %stddev  %change           %stddev
             \       |                  \
   3600570          -5.8%     3392014        stress-ng.access.access_calls_per_sec
      0.47           -0.0        0.46        perf-stat.i.branch-miss-rate%
      0.45           -0.0        0.43        perf-stat.overall.branch-miss-rate%
      5.75 ± 17%   -47.9%        3.00 ± 56%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
     11.49 ± 17%   -47.9%        5.99 ± 56%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      5.75 ± 17%   -47.9%        3.00 ± 56%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
     33.35 ±115%  +222.2%      107.45 ± 56%  perf-sched.wait_time.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     13.95 ± 19%   +32.5%       18.49 ± 12%  perf-sched.wait_time.max.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
     39.46           -1.1       38.39        perf-profile.calltrace.cycles-pp.btrfs_dirty_inode.btrfs_setattr.notify_change.chmod_common.__x64_sys_fchmod
     39.64           -1.1       38.57        perf-profile.calltrace.cycles-pp.chmod_common.__x64_sys_fchmod.do_syscall_64.entry_SYSCALL_64_after_hwframe.fchmod
     39.66           -1.1       38.59        perf-profile.calltrace.cycles-pp.__x64_sys_fchmod.do_syscall_64.entry_SYSCALL_64_after_hwframe.fchmod
     39.52           -1.1       38.46        perf-profile.calltrace.cycles-pp.btrfs_setattr.notify_change.chmod_common.__x64_sys_fchmod.do_syscall_64
     39.71           -1.1       38.64        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fchmod
     39.58           -1.1       38.52        perf-profile.calltrace.cycles-pp.notify_change.chmod_common.__x64_sys_fchmod.do_syscall_64.entry_SYSCALL_64_after_hwframe
     39.71           -1.1       38.65        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fchmod
     39.81           -1.1       38.75        perf-profile.calltrace.cycles-pp.fchmod
      1.24 ±  2%     +0.1        1.39 ±  4%  perf-profile.calltrace.cycles-pp.user_path_at.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.faccessat
      1.49 ±  2%     +0.1        1.64 ±  4%  perf-profile.calltrace.cycles-pp.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.faccessat
      1.61 ±  2%     +0.1        1.76 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.faccessat
      1.64 ±  2%     +0.1        1.78 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.faccessat
      2.01 ±  2%     +0.2        2.16 ±  4%  perf-profile.calltrace.cycles-pp.faccessat
      1.61 ±  2%     +0.2        1.77 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.64 ±  2%     +0.2        1.80 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
      2.08 ±  2%     +0.2        2.25 ±  4%  perf-profile.calltrace.cycles-pp.syscall
      1.45 ±  2%     +0.2        1.61 ±  4%  perf-profile.calltrace.cycles-pp.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.21 ±  3%     +0.2        1.38 ±  4%  perf-profile.calltrace.cycles-pp.user_path_at.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     48.31           +0.8       49.06        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.chmod
     48.30           +0.8       49.06        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.chmod
     48.25           +0.8       49.00        perf-profile.calltrace.cycles-pp.__x64_sys_chmod.do_syscall_64.entry_SYSCALL_64_after_hwframe.chmod
     48.43           +0.8       49.18        perf-profile.calltrace.cycles-pp.chmod
     39.66           -1.1       38.59        perf-profile.children.cycles-pp.__x64_sys_fchmod
     39.83           -1.1       38.77        perf-profile.children.cycles-pp.fchmod
      0.00           +0.1        0.11 ±  6%  perf-profile.children.cycles-pp.__check_heap_object
      0.00           +0.1        0.11 ±  6%  perf-profile.children.cycles-pp.__virt_addr_valid
      2.08 ±  2%     +0.2        2.23 ±  4%  perf-profile.children.cycles-pp.faccessat
      2.16 ±  2%     +0.2        2.33 ±  3%  perf-profile.children.cycles-pp.syscall
      0.00           +0.2        0.23 ±  5%  perf-profile.children.cycles-pp.check_heap_object
      4.40 ±  3%     +0.4        4.83 ±  4%  perf-profile.children.cycles-pp.do_faccessat
      0.44 ±  2%     +0.4        0.87 ±  4%  perf-profile.children.cycles-pp.strncpy_from_user
      0.83 ±  3%     +0.4        1.27 ±  4%  perf-profile.children.cycles-pp.getname_flags
      0.00           +0.5        0.45 ±  4%  perf-profile.children.cycles-pp.__check_object_size
      4.09 ±  3%     +0.5        4.56 ±  4%  perf-profile.children.cycles-pp.user_path_at
     48.25           +0.8       49.01        perf-profile.children.cycles-pp.__x64_sys_chmod
     48.44           +0.8       49.20        perf-profile.children.cycles-pp.chmod
      0.00           +0.1        0.08 ±  5%  perf-profile.self.cycles-pp.__check_object_size
      0.00           +0.1        0.09 ±  5%  perf-profile.self.cycles-pp.check_heap_object
      0.00           +0.1        0.10 ±  4%  perf-profile.self.cycles-pp.__check_heap_object
      0.00           +0.1        0.10 ±  4%  perf-profile.self.cycles-pp.__virt_addr_valid
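The headline figure and the profile deltas can be rechecked by hand from the rows above. The sketch below recomputes the relative change in access_calls_per_sec and the percentage-point increases in children.cycles-pp for the path-name copy functions. The helper name percent_change is made up for illustration, and all values are copied from the table. The numbers are consistent with the newly visible __check_object_size (+0.45) accounting for most of the growth in strncpy_from_user (+0.43), but that reading is an assumption and is not stated in the report.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change from `base` to `new`, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


# stress-ng.access.access_calls_per_sec: parent commit vs. tested commit.
base_calls = 3600570
new_calls = 3392014
headline = percent_change(base_calls, new_calls)

# perf-profile.children.cycles-pp values (percent of cycles), base -> new.
profile = {
    "strncpy_from_user": (0.44, 0.87),
    "getname_flags": (0.83, 1.27),
    "__check_object_size": (0.00, 0.45),
}

# Profile rows are compared in percentage points, not relative percent.
deltas = {name: round(new - base, 2) for name, (base, new) in profile.items()}

print(f"access_calls_per_sec: {headline:+.1f}%")
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f} points")
```

The table rounds the profile deltas to one decimal place, which is why strncpy_from_user and getname_flags both appear there as +0.4.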
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki