[linus:master] [fs] e747e15156: aim9.creat-clo.ops_per_sec 4.2% improvement

From: kernel test robot
Date: Mon Oct 14 2024 - 01:58:59 EST

Hello,

kernel test robot noticed a 4.2% improvement of aim9.creat-clo.ops_per_sec on:


commit: e747e15156b79efeea0ad056df8de14b93d318c2 ("fs: try an opportunistic lookup for O_CREAT opens too")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: aim9
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory
parameters:

testtime: 300s
test: creat-clo
cpufreq_governor: performance
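
For readers unfamiliar with the workload: the profile further down is dominated by creat64 and __close, so creat-clo can be pictured as a tight loop that creates (or truncates) one file and closes it again. The following is a minimal sketch of that pattern, not the AIM9 source. The function name creat_close_loop, the file name, and the iteration count are illustrative assumptions.

```python
import os
import tempfile
import time


def creat_close_loop(path, iterations):
    """Create/truncate `path` and close it `iterations` times; return ops/sec.

    Hypothetical helper for illustration only, not part of AIM9 or lkp-tests.
    """
    # creat(2) is equivalent to open(2) with O_CREAT | O_WRONLY | O_TRUNC.
    flags = os.O_CREAT | os.O_WRONLY | os.O_TRUNC
    start = time.perf_counter()
    for _ in range(iterations):
        fd = os.open(path, flags, 0o644)
        os.close(fd)
    elapsed = time.perf_counter() - start
    return iterations / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "creat-clo.tmp")  # assumed file name
        rate = creat_close_loop(target, 100_000)
        print(f"{rate:.0f} creat+close ops/sec")
```

After the first iteration the file already exists, so every later open with O_CREAT finds an existing dentry. Judging by the commit title ("try an opportunistic lookup for O_CREAT opens too"), that appears to be the case the commit targets, which would fit the shift from lookup_open to lookup_fast in the profile.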

Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241014/202410141350.a747ff5e-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-ivb-2ep2/creat-clo/aim9/300s

commit:
b9ca079dd6 ("eventpoll: Annotate data-race of busy_poll_usecs")
e747e15156 ("fs: try an opportunistic lookup for O_CREAT opens too")

b9ca079dd6b09e08 e747e15156b79efeea0ad056df8
---------------- ---------------------------
value ± %stddev | %change | value ± %stddev | metric
(left: b9ca079dd6, right: e747e15156)
448590 +4.2% 467421 aim9.creat-clo.ops_per_sec
5868 ± 71% -99.7% 19.67 ± 79% proc-vmstat.numa_hint_faults
2929 ±112% -99.4% 17.33 ± 96% proc-vmstat.numa_pages_migrated
2929 ±112% -99.4% 17.33 ± 96% proc-vmstat.pgmigrate_success
0.04 ± 61% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.mnt_want_write.open_last_lookups.path_openat.do_filp_open
0.09 ± 62% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.mnt_want_write.open_last_lookups.path_openat.do_filp_open
2.12 ± 44% +24071.1% 512.02 ±176% perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.04 ± 61% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.mnt_want_write.open_last_lookups.path_openat.do_filp_open
0.09 ± 62% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.mnt_want_write.open_last_lookups.path_openat.do_filp_open
7.648e+08 -2.8% 7.43e+08 perf-stat.i.branch-instructions
1.60 +0.1 1.69 perf-stat.i.branch-miss-rate%
1.14 +2.6% 1.17 perf-stat.i.cpi
3.776e+09 -1.9% 3.706e+09 perf-stat.i.instructions
0.89 -2.6% 0.87 perf-stat.i.ipc
2.00 +0.1 2.10 perf-stat.overall.branch-miss-rate%
1.11 +2.4% 1.14 perf-stat.overall.cpi
0.90 -2.4% 0.88 perf-stat.overall.ipc
7.623e+08 -2.8% 7.406e+08 perf-stat.ps.branch-instructions
3.763e+09 -1.8% 3.694e+09 perf-stat.ps.instructions
1.135e+12 -1.9% 1.113e+12 perf-stat.total.instructions
2.34 ± 5% -1.7 0.69 ± 8% perf-profile.calltrace.cycles-pp.open_last_lookups.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
23.22 -1.1 22.16 perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_creat.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
23.56 -1.1 22.49 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
23.68 -1.1 22.62 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.creat64
23.27 -1.1 22.21 perf-profile.calltrace.cycles-pp.__x64_sys_creat.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
18.68 -0.8 17.84 perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat.do_syscall_64
19.05 -0.8 18.26 perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_creat.do_syscall_64.entry_SYSCALL_64_after_hwframe
29.56 -0.7 28.81 perf-profile.calltrace.cycles-pp.creat64
0.86 ± 3% +0.0 0.90 ± 2% perf-profile.calltrace.cycles-pp.security_file_alloc.init_file.alloc_empty_file.path_openat.do_filp_open
1.29 +0.1 1.38 ± 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.creat64
1.01 ± 4% +0.1 1.12 ± 5% perf-profile.calltrace.cycles-pp.ima_file_check.security_file_post_open.do_open.path_openat.do_filp_open
1.07 ± 5% +0.1 1.18 ± 5% perf-profile.calltrace.cycles-pp.security_file_post_open.do_open.path_openat.do_filp_open.do_sys_openat2
1.53 ± 3% +0.1 1.65 ± 3% perf-profile.calltrace.cycles-pp.cap_inode_need_killpriv.security_inode_need_killpriv.dentry_needs_remove_privs.do_truncate.do_open
1.65 ± 3% +0.1 1.78 ± 3% perf-profile.calltrace.cycles-pp.security_inode_need_killpriv.dentry_needs_remove_privs.do_truncate.do_open.path_openat
0.71 ± 6% +0.1 0.84 ± 13% perf-profile.calltrace.cycles-pp.kmem_cache_free.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
1.72 ± 3% +0.1 1.86 ± 2% perf-profile.calltrace.cycles-pp.dentry_needs_remove_privs.do_truncate.do_open.path_openat.do_filp_open
1.32 ± 3% +0.1 1.46 ± 4% perf-profile.calltrace.cycles-pp.__vfs_getxattr.cap_inode_need_killpriv.security_inode_need_killpriv.dentry_needs_remove_privs.do_truncate
2.57 ± 6% +0.2 2.82 ± 4% perf-profile.calltrace.cycles-pp.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
1.32 ± 14% +0.2 1.57 ± 8% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2
0.74 ± 23% +0.3 1.02 ± 16% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.alloc_empty_file.path_openat.do_filp_open
11.00 +0.7 11.66 perf-profile.calltrace.cycles-pp.__close
10.48 +0.7 11.19 perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
2.39 ± 6% -1.7 0.72 ± 7% perf-profile.children.cycles-pp.open_last_lookups
23.33 -1.1 22.26 perf-profile.children.cycles-pp.do_sys_openat2
23.28 -1.1 22.22 perf-profile.children.cycles-pp.__x64_sys_creat
18.79 -0.8 17.95 perf-profile.children.cycles-pp.path_openat
19.13 -0.8 18.34 perf-profile.children.cycles-pp.do_filp_open
29.79 -0.8 29.04 perf-profile.children.cycles-pp.creat64
29.48 -0.8 28.72 perf-profile.children.cycles-pp.do_syscall_64
29.68 -0.7 28.97 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.48 ± 5% -0.4 0.05 ± 48% perf-profile.children.cycles-pp.lookup_open
0.92 ± 9% -0.4 0.50 ± 7% perf-profile.children.cycles-pp.try_to_unlazy
0.78 ± 10% -0.4 0.40 ± 7% perf-profile.children.cycles-pp.dput
0.79 ± 7% -0.4 0.42 ± 7% perf-profile.children.cycles-pp.__legitimize_path
0.53 ± 13% -0.3 0.26 ± 10% perf-profile.children.cycles-pp.lockref_put_return
0.52 ± 10% -0.2 0.32 ± 14% perf-profile.children.cycles-pp.terminate_walk
0.39 ± 6% -0.2 0.20 ± 8% perf-profile.children.cycles-pp.__legitimize_mnt
3.25 ± 2% -0.2 3.07 ± 2% perf-profile.children.cycles-pp.notify_change
0.76 ± 6% -0.2 0.58 ± 9% perf-profile.children.cycles-pp._raw_spin_lock
0.45 ± 7% -0.2 0.28 ± 14% perf-profile.children.cycles-pp.mnt_want_write
0.52 ± 6% -0.2 0.35 ± 7% perf-profile.children.cycles-pp.security_inode_setattr
0.33 ± 6% -0.2 0.17 ± 15% perf-profile.children.cycles-pp.lockref_get_not_dead
0.32 ± 3% -0.2 0.17 ± 12% perf-profile.children.cycles-pp.down_write
0.60 ± 15% -0.2 0.45 ± 22% perf-profile.children.cycles-pp.step_into
0.31 ± 6% -0.1 0.16 ± 9% perf-profile.children.cycles-pp.up_write
0.45 ± 7% -0.1 0.31 ± 4% perf-profile.children.cycles-pp.mnt_get_write_access
0.48 ± 6% -0.1 0.36 ± 8% perf-profile.children.cycles-pp.__cond_resched
0.35 ± 8% -0.1 0.27 ± 5% perf-profile.children.cycles-pp.evm_inode_setattr
0.42 ± 9% -0.1 0.34 ± 13% perf-profile.children.cycles-pp.generic_permission
0.15 ± 12% -0.1 0.09 ± 12% perf-profile.children.cycles-pp.getname
0.20 ± 6% -0.1 0.14 ± 13% perf-profile.children.cycles-pp.rcu_all_qs
0.13 ± 13% -0.0 0.10 ± 17% perf-profile.children.cycles-pp.mntput_no_expire
0.09 ± 6% -0.0 0.07 ± 11% perf-profile.children.cycles-pp.can_stop_idle_tick
0.06 ± 47% +0.0 0.09 ± 10% perf-profile.children.cycles-pp.inode_newsize_ok
0.87 ± 2% +0.1 0.92 perf-profile.children.cycles-pp.security_file_alloc
0.03 ±100% +0.1 0.08 ± 11% perf-profile.children.cycles-pp.pm_qos_read_value
0.53 ± 5% +0.1 0.59 ± 3% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.19 ± 11% +0.1 0.27 ± 8% perf-profile.children.cycles-pp.setattr_prepare
0.34 ± 10% +0.1 0.42 ± 3% perf-profile.children.cycles-pp.simple_xattr_get
1.08 ± 5% +0.1 1.19 ± 5% perf-profile.children.cycles-pp.security_file_post_open
1.02 ± 5% +0.1 1.13 ± 5% perf-profile.children.cycles-pp.ima_file_check
1.55 ± 3% +0.1 1.67 ± 3% perf-profile.children.cycles-pp.cap_inode_need_killpriv
0.33 ± 11% +0.1 0.46 ± 24% perf-profile.children.cycles-pp.apparmor_file_open
1.67 ± 3% +0.1 1.80 ± 3% perf-profile.children.cycles-pp.security_inode_need_killpriv
0.37 ± 10% +0.1 0.50 ± 20% perf-profile.children.cycles-pp.security_file_open
0.35 ± 7% +0.1 0.50 ± 20% perf-profile.children.cycles-pp.security_current_getsecid_subj
1.74 ± 3% +0.1 1.88 ± 2% perf-profile.children.cycles-pp.dentry_needs_remove_privs
1.34 ± 3% +0.2 1.50 ± 4% perf-profile.children.cycles-pp.__vfs_getxattr
2.98 +0.2 3.17 perf-profile.children.cycles-pp.entry_SYSCALL_64
0.57 ± 7% +0.4 0.98 ± 6% perf-profile.children.cycles-pp.__d_lookup_rcu
0.63 ± 6% +0.4 1.08 ± 5% perf-profile.children.cycles-pp.lookup_fast
0.00 +0.5 0.54 ± 6% perf-profile.children.cycles-pp.complete_walk
11.22 +0.7 11.89 perf-profile.children.cycles-pp.__close
10.53 +0.7 11.23 perf-profile.children.cycles-pp.do_open
0.52 ± 12% -0.3 0.26 ± 10% perf-profile.self.cycles-pp.lockref_put_return
0.73 ± 7% -0.2 0.56 ± 10% perf-profile.self.cycles-pp._raw_spin_lock
0.38 ± 7% -0.2 0.20 ± 8% perf-profile.self.cycles-pp.__legitimize_mnt
0.33 ± 7% -0.2 0.17 ± 15% perf-profile.self.cycles-pp.lockref_get_not_dead
0.30 ± 6% -0.1 0.16 ± 9% perf-profile.self.cycles-pp.up_write
0.44 ± 7% -0.1 0.30 ± 3% perf-profile.self.cycles-pp.mnt_get_write_access
0.72 ± 6% -0.1 0.60 ± 8% perf-profile.self.cycles-pp.do_dentry_open
0.24 ± 5% -0.1 0.12 ± 11% perf-profile.self.cycles-pp.down_write
0.35 ± 11% -0.1 0.27 ± 13% perf-profile.self.cycles-pp.generic_permission
0.20 ± 21% -0.1 0.11 ± 9% perf-profile.self.cycles-pp.open_last_lookups
0.16 ± 9% -0.1 0.08 ± 19% perf-profile.self.cycles-pp.security_inode_setattr
0.16 ± 13% -0.1 0.09 ± 6% perf-profile.self.cycles-pp.getname_flags
0.27 ± 8% -0.1 0.20 ± 6% perf-profile.self.cycles-pp.evm_inode_setattr
0.14 ± 14% -0.1 0.08 ± 18% perf-profile.self.cycles-pp.getname
0.32 ± 2% -0.1 0.26 ± 9% perf-profile.self.cycles-pp.common_perm_cond
0.25 ± 5% -0.1 0.20 ± 8% perf-profile.self.cycles-pp.__cond_resched
0.17 ± 11% -0.1 0.12 ± 14% perf-profile.self.cycles-pp.rcu_all_qs
0.25 ± 9% -0.0 0.20 ± 10% perf-profile.self.cycles-pp.alloc_fd
0.12 ± 7% -0.0 0.08 ± 45% perf-profile.self.cycles-pp.shmem_file_open
0.13 ± 13% -0.0 0.10 ± 17% perf-profile.self.cycles-pp.mntput_no_expire
0.09 ± 6% -0.0 0.07 ± 11% perf-profile.self.cycles-pp.can_stop_idle_tick
0.11 ± 16% +0.0 0.15 ± 12% perf-profile.self.cycles-pp.lockref_get
0.03 ±100% +0.0 0.08 ± 14% perf-profile.self.cycles-pp.pm_qos_read_value
0.04 ± 72% +0.0 0.08 ± 8% perf-profile.self.cycles-pp.inode_newsize_ok
0.12 ± 17% +0.0 0.17 ± 9% perf-profile.self.cycles-pp.setattr_prepare
0.03 ±100% +0.1 0.09 ± 15% perf-profile.self.cycles-pp.lookup_fast
0.17 ± 13% +0.1 0.25 ± 9% perf-profile.self.cycles-pp.simple_xattr_get
0.26 ± 9% +0.1 0.40 ± 25% perf-profile.self.cycles-pp.apparmor_current_getsecid_subj
2.62 +0.2 2.81 perf-profile.self.cycles-pp.entry_SYSCALL_64
0.41 ± 15% +0.3 0.66 ± 15% perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
0.57 ± 6% +0.4 0.97 ± 6% perf-profile.self.cycles-pp.__d_lookup_rcu
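
As a sanity check on how the %change column reads, the sketch below recomputes a few entries from the rounded values printed above. It assumes that rows whose change carries a percent sign are relative changes, and that rows whose change has no percent sign (the perf-profile and miss-rate rows) are absolute differences in percentage points. The helper names are hypothetical.

```python
def pct_change(base, new):
    """Relative change from `base` to `new`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference, used here for metrics that are already percentages."""
    return new - base


# aim9.creat-clo.ops_per_sec: 448590 -> 467421
ops = pct_change(448590, 467421)
print(f"ops_per_sec change: {ops:+.1f}%")  # prints "+4.2%"

# perf-stat.overall: ipc is the reciprocal of cpi (inputs are rounded values).
for cpi in (1.11, 1.14):
    print(f"cpi {cpi:.2f} -> ipc {1.0 / cpi:.2f}")

# perf-profile.children.cycles-pp.open_last_lookups: 2.39 -> 0.72
delta = point_delta(2.39, 0.72)
print(f"open_last_lookups delta: {delta:+.1f} points")
```

Because the inputs are the rounded figures from the table, recomputed changes for other rows can differ from the reported ones in the last digit.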

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki