[jlayton:mgtime] [fs] 0dd26047b0: unixbench.throughput -1.5% regression

From: kernel test robot
Date: Thu Jun 06 2024 - 01:16:38 EST




Hello,

kernel test robot noticed a -1.5% regression of unixbench.throughput on:


commit: 0dd26047b0b803f7a196f0aee91d22116fdb82d3 ("fs: add tracepoints around multigrain timestamp changes")
https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git mgtime

testcase: unixbench
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

runtime: 300s
nr_task: 100%
test: pipe
cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202406061240.e2b14eaf-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240606/202406061240.e2b14eaf-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/300s/lkp-icl-2sp9/pipe/unixbench

commit:
800a833112 ("fs: have setattr_copy handle multigrain timestamps appropriately")
0dd26047b0 ("fs: add tracepoints around multigrain timestamp changes")

800a83311219f2ca 0dd26047b0b803f7a196f0aee91
---------------- ---------------------------
%stddev %change %stddev
\ | \
16773461 ± 5% -19.9% 13438976 ± 7% meminfo.DirectMap2M
24818 ± 47% +64.2% 40751 ± 25% numa-numastat.node0.other_node
24819 ± 47% +64.2% 40751 ± 25% numa-vmstat.node0.numa_other
62019 -1.0% 61380 proc-vmstat.nr_active_anon
62019 -1.0% 61380 proc-vmstat.nr_zone_active_anon
77639 -1.5% 76468 unixbench.score
96583240 -1.5% 95127283 unixbench.throughput
3.776e+10 -1.5% 3.718e+10 unixbench.workload
0.37 ± 2% +0.0 0.42 ± 2% perf-stat.i.branch-miss-rate%
7517646 +306.1% 30525773 perf-stat.i.branch-misses
0.02 +0.1 0.08 perf-stat.overall.branch-miss-rate%
7488773 +306.5% 30445089 perf-stat.ps.branch-misses
32.26 -0.3 32.00 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
31.30 -0.3 31.04 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
25.43 -0.3 25.18 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
28.12 -0.2 27.89 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
17.14 -0.2 16.93 perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
22.80 -0.1 22.66 perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
54.08 -0.1 53.94 perf-profile.calltrace.cycles-pp.write
7.76 -0.1 7.68 perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.vfs_write.ksys_write.do_syscall_64
5.62 -0.1 5.54 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write
6.50 -0.1 6.43 perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write.ksys_write
5.59 -0.1 5.52 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read
6.66 -0.1 6.60 perf-profile.calltrace.cycles-pp.clear_bhb_loop.write
3.80 -0.1 3.74 perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.vfs_read.ksys_read.do_syscall_64
3.14 -0.1 3.09 perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.vfs_read.ksys_read
1.84 -0.0 1.79 perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
1.06 ± 2% -0.0 1.02 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
1.84 -0.0 1.80 perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
0.68 -0.0 0.66 perf-profile.calltrace.cycles-pp.timestamp_truncate.atime_needs_update.touch_atime.pipe_read.vfs_read
1.11 -0.0 1.09 perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.vfs_write.ksys_write.do_syscall_64
36.88 +0.1 36.96 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
37.84 +0.1 37.92 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
33.67 +0.1 33.79 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
30.89 +0.1 31.01 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
1.44 +0.1 1.58 perf-profile.calltrace.cycles-pp.main
6.70 +0.2 6.92 perf-profile.calltrace.cycles-pp.clear_bhb_loop.read
2.38 ± 2% +0.3 2.72 perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_write.ksys_write.do_syscall_64
2.89 +0.3 3.22 perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.78 ± 2% +0.3 2.12 ± 2% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_write.ksys_write
4.10 ± 2% +0.4 4.52 perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.pipe_write.vfs_write.ksys_write
25.66 -0.3 25.40 perf-profile.children.cycles-pp.vfs_read
28.35 -0.2 28.11 perf-profile.children.cycles-pp.ksys_read
17.82 -0.2 17.60 perf-profile.children.cycles-pp.pipe_read
68.72 -0.2 68.52 perf-profile.children.cycles-pp.do_syscall_64
70.48 -0.2 70.29 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
23.47 -0.1 23.32 perf-profile.children.cycles-pp.pipe_write
4.94 -0.1 4.84 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
54.28 -0.1 54.18 perf-profile.children.cycles-pp.write
8.06 -0.1 7.96 perf-profile.children.cycles-pp.copy_page_from_iter
3.98 -0.1 3.88 perf-profile.children.cycles-pp.mutex_lock
6.57 -0.1 6.48 perf-profile.children.cycles-pp.entry_SYSCALL_64
6.66 -0.1 6.58 perf-profile.children.cycles-pp._copy_from_iter
3.63 -0.1 3.57 perf-profile.children.cycles-pp.atime_needs_update
2.58 -0.1 2.52 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
3.95 -0.1 3.89 perf-profile.children.cycles-pp.touch_atime
1.52 -0.0 1.48 perf-profile.children.cycles-pp.timestamp_truncate
31.15 +0.1 31.26 perf-profile.children.cycles-pp.vfs_write
33.92 +0.1 34.04 perf-profile.children.cycles-pp.ksys_write
1.85 +0.1 1.98 perf-profile.children.cycles-pp.main
13.50 +0.2 13.65 perf-profile.children.cycles-pp.clear_bhb_loop
4.88 ± 2% +0.2 5.12 perf-profile.children.cycles-pp.file_update_time
3.88 ± 2% +0.4 4.24 ± 2% perf-profile.children.cycles-pp.apparmor_file_permission
5.17 ± 2% +0.4 5.55 perf-profile.children.cycles-pp.security_file_permission
6.26 +0.4 6.65 perf-profile.children.cycles-pp.rw_verify_area
4.33 ± 2% +0.4 4.74 perf-profile.children.cycles-pp.inode_needs_update_time
4.34 ± 2% -0.2 4.16 perf-profile.self.cycles-pp.pipe_write
4.70 -0.1 4.58 perf-profile.self.cycles-pp.vfs_read
4.77 -0.1 4.66 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.54 ± 2% -0.1 0.44 perf-profile.self.cycles-pp.file_update_time
4.69 -0.1 4.61 perf-profile.self.cycles-pp.vfs_write
2.40 -0.1 2.33 perf-profile.self.cycles-pp.mutex_lock
6.42 -0.1 6.35 perf-profile.self.cycles-pp._copy_from_iter
3.17 -0.1 3.12 perf-profile.self.cycles-pp.pipe_read
1.42 -0.0 1.38 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
1.26 -0.0 1.22 perf-profile.self.cycles-pp.timestamp_truncate
1.91 -0.0 1.88 perf-profile.self.cycles-pp.entry_SYSCALL_64
0.82 -0.0 0.80 perf-profile.self.cycles-pp.__cond_resched
1.02 -0.0 1.00 perf-profile.self.cycles-pp.x64_sys_call
1.06 -0.0 1.04 perf-profile.self.cycles-pp.aa_file_perm
1.47 -0.0 1.45 perf-profile.self.cycles-pp.copy_page_from_iter
0.99 -0.0 0.97 perf-profile.self.cycles-pp.copy_page_to_iter
0.16 ± 2% +0.0 0.18 ± 2% perf-profile.self.cycles-pp.__x64_sys_write
1.64 +0.1 1.75 perf-profile.self.cycles-pp.main
13.36 +0.1 13.50 perf-profile.self.cycles-pp.clear_bhb_loop
2.53 ± 4% +0.4 2.92 ± 3% perf-profile.self.cycles-pp.apparmor_file_permission
1.08 ± 2% +0.4 1.50 ± 2% perf-profile.self.cycles-pp.inode_needs_update_time




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki