[linus:master] [net] e9669a00bb: aim9.udp_test.ops_per_sec 2.7% improvement

From: kernel test robot
Date: Tue May 28 2024 - 22:36:01 EST

Hello,

kernel test robot noticed a 2.7% improvement in aim9.udp_test.ops_per_sec on:


commit: e9669a00bba79442dd4862c57761333d6a020c24 ("net: udp: add IP/port data to the tracepoint udp/udp_fail_queue_rcv_skb")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: aim9
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory
parameters:

testtime: 300s
test: udp_test
cpufreq_governor: performance

Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240529/202405291024.412dc03e-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-ivb-2ep2/udp_test/aim9/300s

commit:
a0ad11fc26 ("net: port TP_STORE_ADDR_PORTS_SKB macro to be tcp/udp independent")
e9669a00bb ("net: udp: add IP/port data to the tracepoint udp/udp_fail_queue_rcv_skb")

a0ad11fc2632903e e9669a00bba79442dd4862c5776
---------------- ---------------------------
         %stddev %change           %stddev
            \       |                 \
    294621         +2.7%      302449        aim9.udp_test.ops_per_sec
     20613         +1.7%       20955        proc-vmstat.nr_slab_reclaimable
 5.444e+08         +2.1%   5.558e+08        perf-stat.i.branch-instructions
   8460968 ± 2%    +4.8%     8867626        perf-stat.i.cache-references
      1.58         -2.3%        1.54        perf-stat.i.cpi
     66.45         +4.0%       69.07        perf-stat.i.cpu-migrations
      4858 ± 5%    -7.6%        4487 ± 3%   perf-stat.i.cycles-between-cache-misses
 2.846e+09         +2.1%   2.906e+09        perf-stat.i.instructions
      0.65         +2.2%        0.67        perf-stat.i.ipc
      1.48         -1.9%        1.45        perf-stat.overall.cpi
      3684 ± 3%    -5.1%        3495 ± 3%   perf-stat.overall.cycles-between-cache-misses
      0.68         +1.9%        0.69        perf-stat.overall.ipc
 5.428e+08         +2.1%   5.541e+08        perf-stat.ps.branch-instructions
   8432000 ± 2%    +4.8%     8837232        perf-stat.ps.cache-references
     66.22         +4.0%       68.84        perf-stat.ps.cpu-migrations
 2.837e+09         +2.1%   2.897e+09        perf-stat.ps.instructions
 8.552e+11         +2.0%   8.726e+11        perf-stat.total.instructions
     21.69          -0.7       21.03        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     22.32          -0.7       21.66        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
     22.29          -0.7       21.63        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     21.21          -0.6       20.60        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     19.97          -0.5       19.47        perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     26.87          -0.5       26.38        perf-profile.calltrace.cycles-pp.write
     18.78          -0.4       18.32        perf-profile.calltrace.cycles-pp.udp_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64
      1.54 ± 4%     -0.3        1.24 ± 4%   perf-profile.calltrace.cycles-pp.loopback_xmit.dev_hard_start_xmit.__dev_queue_xmit.ip_finish_output2.ip_send_skb
      1.67 ± 4%     -0.3        1.40 ± 3%   perf-profile.calltrace.cycles-pp.dev_hard_start_xmit.__dev_queue_xmit.ip_finish_output2.ip_send_skb.udp_send_skb
      0.92 ± 6%     -0.1        0.83 ± 5%   perf-profile.calltrace.cycles-pp.__ip_make_skb.ip_make_skb.udp_sendmsg.sock_write_iter.vfs_write
      3.21          +0.1        3.32 ± 2%   perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
      0.76 ± 6%     +0.1        0.88 ± 5%   perf-profile.calltrace.cycles-pp.ip_rcv.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
      0.92 ± 5%     +0.1        1.05 ± 4%   perf-profile.calltrace.cycles-pp.__skb_recv_udp.udp_recvmsg.inet_recvmsg.sock_recvmsg.sock_read_iter
      2.76          +0.1        2.89 ± 2%   perf-profile.calltrace.cycles-pp.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
      0.43 ± 50%    +0.2        0.58 ± 6%   perf-profile.calltrace.cycles-pp.irqtime_account_irq.__do_softirq.do_softirq.__local_bh_enable_ip.__dev_queue_xmit
      5.96          +0.2        6.21        perf-profile.calltrace.cycles-pp.net_rx_action.__do_softirq.do_softirq.__local_bh_enable_ip.__dev_queue_xmit
      4.81          +0.3        5.07 ± 2%   perf-profile.calltrace.cycles-pp.sock_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.93          +0.3        5.20        perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.__do_softirq
      6.92          +0.3        7.19        perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.ip_send_skb
      6.27          +0.3        6.55 ± 2%   perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read._IO_padn
      6.81          +0.3        7.10        perf-profile.calltrace.cycles-pp.__do_softirq.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2
      5.80          +0.3        6.12 ± 2%   perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
      5.53          +0.3        5.86        perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.__do_softirq.do_softirq.__local_bh_enable_ip
      5.44          +0.3        5.77        perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.__do_softirq.do_softirq
     11.02          +0.4       11.45        perf-profile.calltrace.cycles-pp.read._IO_padn
     11.08          +0.4       11.53        perf-profile.calltrace.cycles-pp._IO_padn
     21.71          -0.7       21.04        perf-profile.children.cycles-pp.ksys_write
     21.23          -0.6       20.62        perf-profile.children.cycles-pp.vfs_write
     19.98          -0.5       19.47        perf-profile.children.cycles-pp.sock_write_iter
     18.81          -0.4       18.37        perf-profile.children.cycles-pp.udp_sendmsg
      1.57 ± 4%     -0.3        1.28 ± 4%   perf-profile.children.cycles-pp.loopback_xmit
      1.67 ± 4%     -0.3        1.40 ± 3%   perf-profile.children.cycles-pp.dev_hard_start_xmit
      0.60 ± 10%    -0.1        0.47 ± 10%  perf-profile.children.cycles-pp.__netif_rx
      0.56 ± 9%     -0.1        0.45 ± 10%  perf-profile.children.cycles-pp.netif_rx_internal
      0.49 ± 10%    -0.1        0.39 ± 8%   perf-profile.children.cycles-pp.enqueue_to_backlog
      0.35 ± 7%     -0.1        0.26 ± 8%   perf-profile.children.cycles-pp.sock_wfree
      0.37 ± 11%    -0.1        0.28 ± 9%   perf-profile.children.cycles-pp.ip_setup_cork
      0.17 ± 27%    -0.1        0.10 ± 15%  perf-profile.children.cycles-pp.__errno_location
      0.12 ± 17%    -0.0        0.08 ± 16%  perf-profile.children.cycles-pp.__errno_location@plt
      0.16 ± 10%    -0.0        0.12 ± 8%   perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
      0.09 ± 18%    +0.1        0.14 ± 12%  perf-profile.children.cycles-pp.validate_xmit_xfrm
      0.30 ± 9%     +0.1        0.36 ± 7%   perf-profile.children.cycles-pp.ip_rcv_core
      0.95 ± 5%     +0.1        1.06 ± 4%   perf-profile.children.cycles-pp.__skb_recv_udp
      0.77 ± 6%     +0.1        0.90 ± 5%   perf-profile.children.cycles-pp.ip_rcv
      2.79          +0.1        2.93 ± 2%   perf-profile.children.cycles-pp.__udp4_lib_rcv
      6.02          +0.2        6.26        perf-profile.children.cycles-pp.net_rx_action
      4.82          +0.3        5.08 ± 2%   perf-profile.children.cycles-pp.sock_read_iter
      4.94          +0.3        5.21        perf-profile.children.cycles-pp.__netif_receive_skb_one_core
      6.93          +0.3        7.20        perf-profile.children.cycles-pp.do_softirq
      6.36          +0.3        6.63        perf-profile.children.cycles-pp.ksys_read
      5.87          +0.3        6.19 ± 2%   perf-profile.children.cycles-pp.vfs_read
      5.46          +0.3        5.78        perf-profile.children.cycles-pp.process_backlog
      5.56          +0.3        5.88        perf-profile.children.cycles-pp.__napi_poll
      8.55 ± 2%     +0.4        8.93 ± 2%   perf-profile.children.cycles-pp.__do_softirq
     11.27          +0.4       11.70        perf-profile.children.cycles-pp.read
     11.08          +0.4       11.53        perf-profile.children.cycles-pp._IO_padn
      0.32 ± 8%     -0.1        0.25 ± 9%   perf-profile.self.cycles-pp.sock_wfree
      0.22 ± 13%    -0.1        0.16 ± 13%  perf-profile.self.cycles-pp.ip_setup_cork
      0.16 ± 18%    -0.1        0.10 ± 18%  perf-profile.self.cycles-pp.__ip_local_out
      0.41 ± 6%     -0.1        0.36 ± 7%   perf-profile.self.cycles-pp.net_rx_action
      0.11 ± 17%    -0.0        0.07 ± 14%  perf-profile.self.cycles-pp.__errno_location@plt
      0.15 ± 10%    -0.0        0.12 ± 9%   perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
      0.04 ± 67%    +0.0        0.08 ± 23%  perf-profile.self.cycles-pp.udp_queue_rcv_skb
      0.20 ± 7%     +0.0        0.25 ± 11%  perf-profile.self.cycles-pp.__udp_enqueue_schedule_skb
      0.09 ± 18%    +0.1        0.14 ± 13%  perf-profile.self.cycles-pp.validate_xmit_xfrm
      0.06 ± 19%    +0.1        0.12 ± 9%   perf-profile.self.cycles-pp.security_socket_recvmsg
      0.14 ± 11%    +0.1        0.20 ± 14%  perf-profile.self.cycles-pp.ip_generic_getfrag
      0.13 ± 19%    +0.1        0.21 ± 12%  perf-profile.self.cycles-pp.inet_recvmsg
      0.27 ± 9%     +0.1        0.35 ± 8%   perf-profile.self.cycles-pp.__udp4_lib_rcv
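
For readers checking the %change column of the comparison table above by hand, here is a minimal sketch in Python. It assumes (this is a reading of the numbers, not something the report states) that counter-style rows such as aim9.udp_test.ops_per_sec show a relative change in percent, while perf-profile rows show an absolute difference in percentage points of cycles. The helper names pct_change and point_delta are hypothetical and are not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return new - base


# Values copied from the table above: (parent commit a0ad11fc26, tested commit e9669a00bb).
ops_per_sec = (294621, 302449)          # aim9.udp_test.ops_per_sec
ksys_write_cycles_pp = (21.69, 21.03)   # perf-profile.children.cycles-pp.ksys_write is similar

print(f"ops_per_sec: {pct_change(*ops_per_sec):+.1f}%")          # prints "ops_per_sec: +2.7%"
print(f"ksys_write:  {point_delta(*ksys_write_cycles_pp):+.1f}")  # prints "ksys_write:  -0.7"
```

Rows whose values are printed with only two significant decimals (for example perf-stat.i.ipc) may not reproduce the reported %change exactly from the displayed digits, presumably because the report computes it from unrounded data.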

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki