[linus:master] [net] 2b0cfa6e49: stress-ng.sockfd.ops_per_sec 6.9% improvement

From: kernel test robot
Date: Wed Apr 03 2024 - 03:54:29 EST

Hello,

kernel test robot noticed a 6.9% improvement in stress-ng.sockfd.ops_per_sec on:


commit: 2b0cfa6e49566c8fa6759734cf821aa6e8271a9e ("net: add generic percpu page_pool allocator")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

        nr_threads: 100%
        testtime: 60s
        test: sockfd
        cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240403/202404031530.bbb648b3-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sockfd/stress-ng/60s

commit:
32e4a5447e ("net: dsa: realtek: fix digital interface select macro for EXT0")
2b0cfa6e49 ("net: add generic percpu page_pool allocator")

32e4a5447ed9fa90 2b0cfa6e49566c8fa6759734cf8
---------------- ---------------------------
        %stddev  %change          %stddev
           \        |                \
     64663         +3.5%       66953       vmstat.system.cs
  54271775         +6.9%    58029628       stress-ng.sockfd.ops
    904069         +6.9%      966535       stress-ng.sockfd.ops_per_sec
   2226722         -2.4%     2174212       stress-ng.time.involuntary_context_switches
   1678568 ± 2%   +11.7%     1874352       stress-ng.time.voluntary_context_switches
     48.68         +0.1        48.79       perf-profile.calltrace.cycles-pp.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg
     48.54         +0.1        48.66       perf-profile.calltrace.cycles-pp._raw_spin_lock.unix_inflight.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg
     48.56         +0.1        48.67       perf-profile.calltrace.cycles-pp.unix_inflight.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg
     48.46         +0.1        48.58       perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.unix_inflight.unix_scm_to_skb.unix_stream_sendmsg
      0.25 ± 3%    +0.0         0.28 ± 3%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.15 ± 4%    +0.0         0.18 ± 5%  perf-profile.children.cycles-pp.__fput
      0.21 ± 3%    +0.0         0.24 ± 3%  perf-profile.children.cycles-pp.task_work_run
      0.22 ± 4%    +0.0         0.25 ± 4%  perf-profile.children.cycles-pp.syscall
     48.68         +0.1        48.79       perf-profile.children.cycles-pp.unix_scm_to_skb
     48.56         +0.1        48.67       perf-profile.children.cycles-pp.unix_inflight
      0.33         -0.0         0.30       perf-profile.self.cycles-pp._raw_spin_lock
      0.09 ± 4%    +0.0         0.11 ± 6%  perf-profile.self.cycles-pp.lockref_put_return
      0.46         +4.5%        0.48 ± 2%  perf-stat.i.MPKI
  28697474         +6.5%    30562835       perf-stat.i.cache-misses
  96563339         +5.3%   1.017e+08       perf-stat.i.cache-references
     66959         +3.8%       69506       perf-stat.i.context-switches
      9.73         -1.3%        9.60       perf-stat.i.cpi
     23006         -6.2%       21577       perf-stat.i.cycles-between-cache-misses
 6.536e+10         +1.6%   6.643e+10       perf-stat.i.instructions
      0.43         +4.9%        0.46       perf-stat.overall.MPKI
      9.87         -1.4%        9.73       perf-stat.overall.cpi
     22695         -6.0%       21332       perf-stat.overall.cycles-between-cache-misses
      0.10         +1.4%        0.10       perf-stat.overall.ipc
  27236190         +6.2%    28914374       perf-stat.ps.cache-misses
  93594596         +4.9%    98187722       perf-stat.ps.cache-references
     64403         +3.7%       66793       perf-stat.ps.context-switches
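
For readers who want to sanity-check the %change column, here is a minimal Python sketch that recomputes it from the two value columns of a few rows above. It is not part of the LKP tooling; the helper name percent_change is hypothetical, and the sketch assumes %change is the relative difference against the base commit (32e4a5447e), rounded to one decimal place. It also assumes ipc is simply 1/cpi, which would explain why perf-stat.overall.ipc shows +1.4% even though both printed values round to 0.10. Rows from perf-profile appear to report absolute percentage-point deltas (no % sign), so they are left out.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# (base commit value, patched commit value), copied from the table above.
rows = {
    "stress-ng.sockfd.ops_per_sec": (904069, 966535),
    "stress-ng.sockfd.ops": (54271775, 58029628),
    "vmstat.system.cs": (64663, 66953),
    "perf-stat.i.cache-misses": (28697474, 30562835),
}

for name, (base, new) in rows.items():
    print(f"{percent_change(base, new):+.1f}%  {name}")

# Assumption: ipc = 1 / cpi. The table prints ipc as 0.10 on both sides,
# so the reported change is only visible when derived from overall.cpi.
ipc_base, ipc_new = 1 / 9.87, 1 / 9.73
print(f"{percent_change(ipc_base, ipc_new):+.1f}%  perf-stat.overall.ipc (from cpi)")
```

The recomputed figures can be compared by eye against the %change column; small mismatches in other rows would most likely come from the table values themselves being rounded.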




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki