[linus:master] [x86/acpi] ec9aedb2aa: aim9.exec_test.ops_per_sec 2.4% improvement
From: kernel test robot
Date: Thu Nov 30 2023 - 00:50:30 EST
Hello,
kernel test robot noticed a 2.4% improvement of aim9.exec_test.ops_per_sec on:
commit: ec9aedb2aa1ab7ac420c00b31f5edc5be15ec167 ("x86/acpi: Ignore invalid x2APIC entries")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: aim9
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
parameters:
testtime: 300s
test: exec_test
cpufreq_governor: performance
In addition to the detailed comparison below, we also noticed a difference in dmesg.
With this commit (ec9aedb2aa):
[ 1.311075][ T0] smpboot: Allowing 48 CPUs, 0 hotplug CPUs
With the parent commit:
[ 1.311098][ T0] smpboot: Allowing 168 CPUs, 120 hotplug CPUs
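As a rough sanity check (not part of the robot's output), the two smpboot lines can be parsed to see how far the number of possible CPUs dropped. The regular expression and the helper name possible_cpus are illustrative assumptions. Per-CPU allocations are generally sized by the number of possible CPUs, so this drop is a plausible, though unverified here, explanation for the meminfo.Percpu and pcpu_alloc changes in the comparison below.

```python
import re

# The two smpboot lines quoted from dmesg in this report.
DMESG_COMMIT = "[ 1.311075][ T0] smpboot: Allowing 48 CPUs, 0 hotplug CPUs"
DMESG_PARENT = "[ 1.311098][ T0] smpboot: Allowing 168 CPUs, 120 hotplug CPUs"

# Pattern written for this illustration only; it matches the message format shown above.
SMPBOOT_RE = re.compile(r"smpboot: Allowing (\d+) CPUs, (\d+) hotplug CPUs")


def possible_cpus(line):
    """Return (possible, hotplug) CPU counts from an smpboot dmesg line."""
    match = SMPBOOT_RE.search(line)
    if match is None:
        raise ValueError("not an smpboot line: " + line)
    return int(match.group(1)), int(match.group(2))


commit_total, commit_hotplug = possible_cpus(DMESG_COMMIT)
parent_total, parent_hotplug = possible_cpus(DMESG_PARENT)

# Relative change in the number of possible CPUs, in percent.
cpu_change_pct = (commit_total - parent_total) / parent_total * 100
print(f"possible CPUs: {parent_total} -> {commit_total} ({cpu_change_pct:+.1f}%)")
# prints: possible CPUs: 168 -> 48 (-71.4%)
```

The parent's 168 possible CPUs minus its 120 hotplug CPUs gives the same 48 CPUs that the commit reports, which matches the 48 threads of the test machine.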
Details are below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231130/202311301346.56b0fcd6-oliver.sang@xxxxxxxxx
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/exec_test/aim9/300s
commit:
31255e072b ("x86/shstk: Delay signal entry SSP write until after user accesses")
ec9aedb2aa ("x86/acpi: Ignore invalid x2APIC entries")
31255e072b2e91f9 ec9aedb2aa1ab7ac420c00b31f5
---------------- ---------------------------
         %stddev   %change           %stddev
             \         |                 \
      8587 ±  3%     +5.9%        9091        vmstat.system.cs
      6542 ±  9%    -18.2%        5352 ±  7%  numa-meminfo.node1.KernelStack
     57960 ±  4%    -12.6%       50656 ±  6%  numa-meminfo.node1.SUnreclaim
      6541 ±  9%    -18.0%        5363 ±  6%  numa-vmstat.node1.nr_kernel_stack
     14490 ±  4%    -12.6%       12663 ±  6%  numa-vmstat.node1.nr_slab_unreclaimable
    179678 ±  7%    -22.6%      139060 ± 10%  meminfo.DirectMap4k
     13670          -13.6%       11809        meminfo.KernelStack
     78243          -72.5%       21498        meminfo.Percpu
      1222           +2.4%        1251        aim9.exec_test.ops_per_sec
  27978802           +3.1%    28859909        aim9.time.minor_page_faults
    175.04           -6.2%      164.11        aim9.time.system_time
    115.72           +9.1%      126.24        aim9.time.user_time
    731948           +2.4%      749684        aim9.time.voluntary_context_switches
     13669          -13.8%       11788        proc-vmstat.nr_kernel_stack
     21028           -3.2%       20355        proc-vmstat.nr_slab_reclaimable
     29074           -9.0%       26443        proc-vmstat.nr_slab_unreclaimable
     50357           -1.3%       49699        proc-vmstat.numa_other
  28937047           +3.0%    29790891        proc-vmstat.pgfault
      0.55 ±  5%      +0.1        0.65 ±  7%  perf-profile.calltrace.cycles-pp.next_uptodate_folio.filemap_map_pages.do_read_fault.do_fault.__handle_mm_fault
      1.38 ±  6%      -0.7        0.67 ±  9%  perf-profile.children.cycles-pp.mm_init
      0.87 ±  7%      -0.5        0.38 ± 10%  perf-profile.children.cycles-pp.pcpu_alloc
      0.76 ±  8%      -0.3        0.42 ±  8%  perf-profile.children.cycles-pp.alloc_bprm
      0.50 ±  6%      -0.3        0.17 ±  6%  perf-profile.children.cycles-pp.memset_orig
      0.40 ±  5%      -0.2        0.15 ± 18%  perf-profile.children.cycles-pp.__percpu_counter_init_many
      0.15 ± 20%      -0.1        0.03 ±101%  perf-profile.children.cycles-pp.mm_init_cid
      0.23 ± 14%      -0.1        0.12 ± 19%  perf-profile.children.cycles-pp._find_next_bit
      0.30 ± 10%      -0.1        0.24 ± 16%  perf-profile.children.cycles-pp.mas_preallocate
      0.14 ± 18%      -0.0        0.09 ± 16%  perf-profile.children.cycles-pp.pm_qos_read_value
      0.09 ± 15%      -0.0        0.07 ± 10%  perf-profile.children.cycles-pp.remove_vma
      0.05 ± 47%      +0.1        0.11 ± 26%  perf-profile.children.cycles-pp.malloc
      0.20 ± 22%      +0.1        0.25 ±  7%  perf-profile.children.cycles-pp.do_brk_flags
      0.44 ±  5%      +0.1        0.53 ±  8%  perf-profile.children.cycles-pp.mod_objcg_state
      0.80 ±  4%      +0.2        0.96 ±  6%  perf-profile.children.cycles-pp.next_uptodate_folio
      0.50 ±  7%      -0.3        0.17 ±  6%  perf-profile.self.cycles-pp.memset_orig
      0.26 ± 16%      -0.2        0.04 ±106%  perf-profile.self.cycles-pp.mm_init
      0.14 ± 25%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.mm_init_cid
      0.18 ± 22%      -0.1        0.08 ± 34%  perf-profile.self.cycles-pp.pcpu_alloc
      0.13 ± 16%      -0.0        0.08 ± 20%  perf-profile.self.cycles-pp.pm_qos_read_value
      0.37 ±  6%      +0.1        0.45 ± 10%  perf-profile.self.cycles-pp.mod_objcg_state
      0.66 ±  5%      +0.1        0.80 ±  6%  perf-profile.self.cycles-pp.next_uptodate_folio
  34087721 ±  2%     +3.6%    35301961        perf-stat.i.branch-misses
      8601 ±  3%     +6.1%        9122        perf-stat.i.context-switches
     72.92 ±  2%     +7.4%       78.30 ±  3%  perf-stat.i.cpu-migrations
      1.55 ±  2%      -0.1        1.42 ±  3%  perf-stat.i.dTLB-load-miss-rate%
      0.51 ±  2%      -0.2        0.32        perf-stat.i.dTLB-store-miss-rate%
   2867856 ±  3%    -36.9%     1810983        perf-stat.i.dTLB-store-misses
 5.561e+08 ±  2%     +3.0%    5.73e+08        perf-stat.i.dTLB-stores
     92019 ±  4%    +10.2%      101371        perf-stat.i.iTLB-loads
    126.43 ± 15%    -33.8%       83.76        perf-stat.i.metric.K/sec
     90050 ±  4%     +6.8%       96193        perf-stat.i.minor-faults
     19.22 ±  4%      -1.5       17.77 ±  3%  perf-stat.i.node-store-miss-rate%
     90050 ±  4%     +6.8%       96194        perf-stat.i.page-faults
      1.48 ±  2%      -0.1        1.38 ±  3%  perf-stat.overall.dTLB-load-miss-rate%
      0.51            -0.2        0.32        perf-stat.overall.dTLB-store-miss-rate%
  33982829 ±  2%     +3.5%    35183134        perf-stat.ps.branch-misses
      8573 ±  3%     +6.0%        9090        perf-stat.ps.context-switches
     72.73 ±  2%     +7.4%       78.13 ±  3%  perf-stat.ps.cpu-migrations
   2858954 ±  3%    -36.9%     1805251        perf-stat.ps.dTLB-store-misses
 5.545e+08 ±  2%     +3.0%   5.712e+08        perf-stat.ps.dTLB-stores
     91889 ±  4%    +10.2%      101265        perf-stat.ps.iTLB-loads
     89770 ±  4%     +6.8%       95880        perf-stat.ps.minor-faults
     89771 ±  4%     +6.8%       95880        perf-stat.ps.page-faults
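For readers checking the comparison above by hand, the following minimal sketch recomputes a few of the reported changes. It assumes the middle column is a relative change in percent for ordinary counters, and an absolute difference in percentage points for metrics that are themselves percentages (the rows whose change carries no % sign). The function names are hypothetical and not part of any lkp tooling.

```python
def pct_change(base, new):
    """Relative change from the parent value to the commit value, in percent."""
    return (new - base) / base * 100


def point_change(base, new):
    """Absolute difference, used here for metrics that are already percentages."""
    return new - base


# Values copied from the rows above: (parent, commit).
print(f"aim9.exec_test.ops_per_sec:      {pct_change(1222, 1251):+.1f}%")        # +2.4%
print(f"meminfo.Percpu:                  {pct_change(78243, 21498):+.1f}%")      # -72.5%
print(f"perf-stat.i.dTLB-store-misses:   {pct_change(2867856, 1810983):+.1f}%")  # -36.9%
print(f"perf-stat.overall.dTLB-store-miss-rate%: {point_change(0.51, 0.32):+.1f}")  # -0.2
```

Under these assumptions the recomputed figures agree with the reported column to the displayed precision.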
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki