Re: [linux-next:master] [mm/hugetlb_vmemmap] 875fa64577: vm-scalability.throughput -34.3% regression

From: Oliver Sang
Date: Fri Jul 19 2024 - 04:44:23 EST


hi, Yu Zhao,

On Wed, Jul 17, 2024 at 09:44:33AM -0600, Yu Zhao wrote:
> On Wed, Jul 17, 2024 at 2:36 AM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> >
> > Hi Janosch and Oliver,
> >
> > On Wed, Jul 17, 2024 at 1:57 AM Janosch Frank <frankja@xxxxxxxxxxxxx> wrote:
> > >
> > > On 7/9/24 07:11, kernel test robot wrote:
> > > > Hello,
> > > >
> > > > kernel test robot noticed a -34.3% regression of vm-scalability.throughput on:
> > > >
> > > >
> > > > commit: 875fa64577da9bc8e9963ee14fef8433f20653e7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
> > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > >
> > > > [still regression on linux-next/master 0b58e108042b0ed28a71cd7edf5175999955b233]
> > > >
> > > This has hit s390 huge page backed KVM guests as well.
> > > Our simple start/stop test case went from ~5 to over 50 seconds of runtime.
> >
> > Could you try the attached patch please? Thank you.
>
> Thanks, Yosry, for spotting the following typo:
> flags &= VMEMMAP_SYNCHRONIZE_RCU;
> It's supposed to be:
> flags &= ~VMEMMAP_SYNCHRONIZE_RCU;
>
> Reattaching v2 with the above typo fixed. Please let me know, Janosch & Oliver.
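
For readers following the typo fix quoted above, the difference between the two statements can be sketched in a few lines of C. The flag names and bit values below are placeholders chosen for illustration only; they are not the kernel's definitions, and this is not code from the patch.

```c
#include <assert.h>

/* Hypothetical flag bits for illustration only; not the kernel's values. */
#define DEMO_FLAG_A           (1UL << 0)
#define DEMO_SYNCHRONIZE_RCU  (1UL << 2)

/* Typo variant: keeps ONLY the RCU bit and drops every other flag. */
static unsigned long mask_keep_only(unsigned long flags)
{
    flags &= DEMO_SYNCHRONIZE_RCU;
    return flags;
}

/* Intended variant: clears the RCU bit and leaves the other flags alone. */
static unsigned long mask_clear(unsigned long flags)
{
    flags &= ~DEMO_SYNCHRONIZE_RCU;
    return flags;
}

/* Small self-check showing how the two variants diverge on the same input. */
static void demo_mask_selfcheck(void)
{
    unsigned long flags = DEMO_FLAG_A | DEMO_SYNCHRONIZE_RCU;

    assert(mask_keep_only(flags) == DEMO_SYNCHRONIZE_RCU);
    assert(mask_clear(flags) == DEMO_FLAG_A);
}
```

Calling demo_mask_selfcheck() from a test program exercises both variants. With the typo, the RCU bit stays set and all other flags are lost, which is the opposite of what the corrected line does.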

since the commit is in mainline now, I applied your v2 patch directly on top of
bd225530a4c71 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")

in our tests, your v2 patch not only recovers the performance regression, it
also shows a +13.7% performance improvement over 5a4d8944d6b1e (the parent of
bd225530a4c71)

details are below

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/300s/512G/lkp-icl-2sp2/anon-cow-rand-hugetlb/vm-scalability

commit:
5a4d8944d6b1e ("cachestat: do not flush stats in recency check")
bd225530a4c71 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
9a5b87b521401 <---- your v2 patch

5a4d8944d6b1e1aa bd225530a4c717714722c373144 9a5b87b5214018a2be217dc4648
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
4.271e+09 ± 10% +348.4% 1.915e+10 ± 6% -39.9% 2.567e+09 ± 20% cpuidle..time
774593 ± 4% +1060.9% 8992186 ± 6% -17.2% 641254 cpuidle..usage
555365 ± 8% +28.0% 710795 ± 2% -4.5% 530157 ± 5% numa-numastat.node0.local_node
629633 ± 4% +23.0% 774346 ± 5% +0.6% 633264 ± 4% numa-numastat.node0.numa_hit
255.76 ± 2% +31.1% 335.40 ± 3% -13.8% 220.53 ± 2% uptime.boot
10305 ± 6% +144.3% 25171 ± 5% -17.1% 8543 ± 8% uptime.idle
1.83 ± 58% +96200.0% 1765 ±155% +736.4% 15.33 ± 24% perf-c2c.DRAM.local
33.00 ± 16% +39068.2% 12925 ±122% +95.5% 64.50 ± 49% perf-c2c.DRAM.remote
21.33 ± 8% +2361.7% 525.17 ± 31% +271.1% 79.17 ± 52% perf-c2c.HITM.local
9.17 ± 21% +3438.2% 324.33 ± 57% +270.9% 34.00 ± 60% perf-c2c.HITM.remote
16.11 ± 7% +37.1 53.16 ± 2% -4.6 11.50 ± 19% mpstat.cpu.all.idle%
0.34 ± 2% -0.1 0.22 +0.0 0.35 ± 3% mpstat.cpu.all.irq%
0.03 ± 5% +0.0 0.04 ± 8% -0.0 0.02 mpstat.cpu.all.soft%
10.58 ± 4% -9.5 1.03 ± 36% +0.1 10.71 ± 2% mpstat.cpu.all.sys%
72.94 ± 2% -27.4 45.55 ± 3% +4.5 77.41 ± 2% mpstat.cpu.all.usr%
6.00 ± 16% +230.6% 19.83 ± 5% +8.3% 6.50 ± 17% mpstat.max_utilization.seconds
16.95 ± 7% +215.5% 53.48 ± 2% -26.2% 12.51 ± 16% vmstat.cpu.id
72.33 ± 2% -37.4% 45.31 ± 3% +6.0% 76.65 ± 2% vmstat.cpu.us
2.254e+08 -0.0% 2.254e+08 +14.7% 2.584e+08 vmstat.memory.free
108.30 -43.3% 61.43 ± 2% +5.4% 114.12 ± 2% vmstat.procs.r
2659 +162.6% 6982 ± 3% +3.6% 2753 ± 4% vmstat.system.cs
136384 ± 4% -21.9% 106579 ± 7% +13.3% 154581 ± 3% vmstat.system.in
203.41 ± 2% +39.2% 283.06 ± 4% -17.1% 168.71 ± 2% time.elapsed_time
203.41 ± 2% +39.2% 283.06 ± 4% -17.1% 168.71 ± 2% time.elapsed_time.max
148901 ± 6% -45.6% 81059 ± 4% -8.8% 135748 ± 8% time.involuntary_context_switches
169.83 ± 23% +85.3% 314.67 ± 8% +7.9% 183.33 ± 7% time.major_page_faults
10697 -43.4% 6050 ± 2% +5.6% 11294 ± 2% time.percent_of_cpu_this_job_got
2740 ± 6% -86.7% 365.06 ± 43% -16.1% 2298 time.system_time
19012 -11.9% 16746 -11.9% 16747 time.user_time
14412 ± 5% +4432.0% 653187 -16.6% 12025 ± 3% time.voluntary_context_switches
50095 ± 2% -31.5% 34325 ± 2% +18.6% 59408 vm-scalability.median
8.25 ± 16% -3.4 4.84 ± 22% -6.6 1.65 ± 15% vm-scalability.median_stddev%
6863720 -34.0% 4532485 +13.7% 7805408 vm-scalability.throughput
203.41 ± 2% +39.2% 283.06 ± 4% -17.1% 168.71 ± 2% vm-scalability.time.elapsed_time
203.41 ± 2% +39.2% 283.06 ± 4% -17.1% 168.71 ± 2% vm-scalability.time.elapsed_time.max
148901 ± 6% -45.6% 81059 ± 4% -8.8% 135748 ± 8% vm-scalability.time.involuntary_context_switches
10697 -43.4% 6050 ± 2% +5.6% 11294 ± 2% vm-scalability.time.percent_of_cpu_this_job_got
2740 ± 6% -86.7% 365.06 ± 43% -16.1% 2298 vm-scalability.time.system_time
19012 -11.9% 16746 -11.9% 16747 vm-scalability.time.user_time
14412 ± 5% +4432.0% 653187 -16.6% 12025 ± 3% vm-scalability.time.voluntary_context_switches
1.159e+09 +0.0% 1.159e+09 +1.6% 1.178e+09 vm-scalability.workload
22900043 ± 4% +1.2% 23166356 ± 6% -16.7% 19076170 ± 5% numa-vmstat.node0.nr_free_pages
42856 ± 43% +998.5% 470779 ± 51% +318.6% 179409 ±154% numa-vmstat.node0.nr_unevictable
42856 ± 43% +998.5% 470779 ± 51% +318.6% 179409 ±154% numa-vmstat.node0.nr_zone_unevictable
629160 ± 4% +22.9% 773391 ± 5% +0.5% 632570 ± 4% numa-vmstat.node0.numa_hit
554892 ± 8% +27.9% 709841 ± 2% -4.6% 529463 ± 5% numa-vmstat.node0.numa_local
27469 ± 14% +0.0% 27475 ± 41% -31.7% 18763 ± 13% numa-vmstat.node1.nr_active_anon
767179 ± 2% -55.8% 339212 ± 72% -19.7% 616417 ± 43% numa-vmstat.node1.nr_file_pages
10693349 ± 5% +46.3% 15639681 ± 7% +69.4% 18112002 ± 3% numa-vmstat.node1.nr_free_pages
14210 ± 27% -65.0% 4973 ± 49% -34.7% 9280 ± 39% numa-vmstat.node1.nr_mapped
724050 ± 2% -59.1% 296265 ± 82% -18.9% 587498 ± 47% numa-vmstat.node1.nr_unevictable
27469 ± 14% +0.0% 27475 ± 41% -31.7% 18763 ± 13% numa-vmstat.node1.nr_zone_active_anon
724050 ± 2% -59.1% 296265 ± 82% -18.9% 587498 ± 47% numa-vmstat.node1.nr_zone_unevictable
120619 ± 11% +13.6% 137042 ± 27% -31.2% 82976 ± 7% meminfo.Active
120472 ± 11% +13.6% 136895 ± 27% -31.2% 82826 ± 7% meminfo.Active(anon)
70234807 +14.6% 80512468 +10.2% 77431344 meminfo.CommitLimit
2.235e+08 +0.1% 2.237e+08 +15.1% 2.573e+08 meminfo.DirectMap1G
44064 -22.8% 34027 ± 2% +20.7% 53164 ± 2% meminfo.HugePages_Surp
44064 -22.8% 34027 ± 2% +20.7% 53164 ± 2% meminfo.HugePages_Total
90243440 -22.8% 69688103 ± 2% +20.7% 1.089e+08 ± 2% meminfo.Hugetlb
70163 ± 29% -42.6% 40293 ± 11% -21.9% 54789 ± 15% meminfo.Mapped
1.334e+08 +15.5% 1.541e+08 +10.7% 1.477e+08 meminfo.MemAvailable
1.344e+08 +15.4% 1.551e+08 +10.7% 1.488e+08 meminfo.MemFree
2.307e+08 +0.0% 2.307e+08 +14.3% 2.637e+08 meminfo.MemTotal
96309843 -21.5% 75639108 ± 2% +19.4% 1.15e+08 ± 2% meminfo.Memused
259553 ± 2% -0.9% 257226 ± 15% -10.5% 232211 ± 4% meminfo.Shmem
1.2e+08 -2.4% 1.172e+08 +13.3% 1.36e+08 meminfo.max_used_kB
18884 ± 10% -7.2% 17519 ± 15% +37.6% 25983 ± 6% numa-meminfo.node0.HugePages_Surp
18884 ± 10% -7.2% 17519 ± 15% +37.6% 25983 ± 6% numa-meminfo.node0.HugePages_Total
91526744 ± 4% +1.2% 92620825 ± 6% -16.7% 76248423 ± 5% numa-meminfo.node0.MemFree
40158207 ± 9% -2.7% 39064126 ± 15% +38.0% 55436528 ± 7% numa-meminfo.node0.MemUsed
171426 ± 43% +998.5% 1883116 ± 51% +318.6% 717638 ±154% numa-meminfo.node0.Unevictable
110091 ± 14% -0.1% 109981 ± 41% -31.7% 75226 ± 13% numa-meminfo.node1.Active
110025 ± 14% -0.1% 109915 ± 41% -31.7% 75176 ± 13% numa-meminfo.node1.Active(anon)
3068496 ± 2% -55.8% 1356754 ± 72% -19.6% 2466084 ± 43% numa-meminfo.node1.FilePages
25218 ± 4% -34.7% 16475 ± 12% +7.9% 27213 ± 3% numa-meminfo.node1.HugePages_Surp
25218 ± 4% -34.7% 16475 ± 12% +7.9% 27213 ± 3% numa-meminfo.node1.HugePages_Total
55867 ± 27% -65.5% 19266 ± 50% -34.4% 36671 ± 38% numa-meminfo.node1.Mapped
42795888 ± 5% +46.1% 62520130 ± 7% +69.3% 72441496 ± 3% numa-meminfo.node1.MemFree
99028084 +0.0% 99028084 +33.4% 1.321e+08 numa-meminfo.node1.MemTotal
56232195 ± 3% -35.1% 36507953 ± 12% +6.0% 59616707 ± 4% numa-meminfo.node1.MemUsed
2896199 ± 2% -59.1% 1185064 ± 82% -18.9% 2349991 ± 47% numa-meminfo.node1.Unevictable
507357 +0.0% 507357 +1.7% 516000 proc-vmstat.htlb_buddy_alloc_success
29942 ± 10% +14.3% 34235 ± 27% -30.7% 20740 ± 7% proc-vmstat.nr_active_anon
3324095 +15.7% 3847387 +10.9% 3686860 proc-vmstat.nr_dirty_background_threshold
6656318 +15.7% 7704181 +10.9% 7382735 proc-vmstat.nr_dirty_threshold
33559092 +15.6% 38798108 +10.9% 37209133 proc-vmstat.nr_free_pages
197697 ± 2% -2.5% 192661 +1.0% 199623 proc-vmstat.nr_inactive_anon
17939 ± 28% -42.5% 10307 ± 11% -22.4% 13927 ± 14% proc-vmstat.nr_mapped
2691 -7.1% 2501 +2.9% 2769 proc-vmstat.nr_page_table_pages
64848 ± 2% -0.7% 64386 ± 15% -10.6% 57987 ± 4% proc-vmstat.nr_shmem
29942 ± 10% +14.3% 34235 ± 27% -30.7% 20740 ± 7% proc-vmstat.nr_zone_active_anon
197697 ± 2% -2.5% 192661 +1.0% 199623 proc-vmstat.nr_zone_inactive_anon
1403095 +9.3% 1534152 ± 2% -3.2% 1358244 proc-vmstat.numa_hit
1267544 +10.6% 1401482 ± 2% -3.4% 1224210 proc-vmstat.numa_local
2.608e+08 +0.1% 2.609e+08 +1.7% 2.651e+08 proc-vmstat.pgalloc_normal
1259957 +13.4% 1428284 ± 2% -6.5% 1178198 proc-vmstat.pgfault
2.591e+08 +0.3% 2.6e+08 +2.3% 2.649e+08 proc-vmstat.pgfree
36883 ± 3% +18.5% 43709 ± 5% -12.2% 32371 ± 3% proc-vmstat.pgreuse
1.88 ± 16% -0.6 1.33 ±100% +0.9 2.80 ± 11% perf-profile.calltrace.cycles-pp.nrand48_r
16.19 ± 85% +28.6 44.75 ± 95% -11.4 4.78 ±218% perf-profile.calltrace.cycles-pp.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
16.20 ± 85% +28.6 44.78 ± 95% -11.4 4.78 ±218% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
16.22 ± 85% +28.6 44.82 ± 95% -11.4 4.79 ±218% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access
16.22 ± 85% +28.6 44.82 ± 95% -11.4 4.79 ±218% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
16.24 ± 85% +28.8 45.01 ± 95% -11.4 4.80 ±218% perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access
12.42 ± 84% +29.5 41.89 ± 95% -8.8 3.65 ±223% perf-profile.calltrace.cycles-pp.copy_mc_enhanced_fast_string.copy_subpage.copy_user_large_folio.hugetlb_wp.hugetlb_fault
12.52 ± 84% +29.6 42.08 ± 95% -8.8 3.68 ±223% perf-profile.calltrace.cycles-pp.copy_subpage.copy_user_large_folio.hugetlb_wp.hugetlb_fault.handle_mm_fault
12.53 ± 84% +29.7 42.23 ± 95% -8.9 3.68 ±223% perf-profile.calltrace.cycles-pp.copy_user_large_folio.hugetlb_wp.hugetlb_fault.handle_mm_fault.do_user_addr_fault
12.80 ± 84% +30.9 43.65 ± 95% -9.0 3.76 ±223% perf-profile.calltrace.cycles-pp.hugetlb_wp.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
2.50 ± 17% -0.7 1.78 ±100% +1.2 3.73 ± 11% perf-profile.children.cycles-pp.nrand48_r
16.24 ± 85% +28.6 44.87 ± 95% -11.4 4.79 ±218% perf-profile.children.cycles-pp.do_user_addr_fault
16.24 ± 85% +28.6 44.87 ± 95% -11.4 4.79 ±218% perf-profile.children.cycles-pp.exc_page_fault
16.20 ± 85% +28.7 44.86 ± 95% -11.4 4.78 ±218% perf-profile.children.cycles-pp.hugetlb_fault
16.22 ± 85% +28.7 44.94 ± 95% -11.4 4.79 ±218% perf-profile.children.cycles-pp.handle_mm_fault
16.26 ± 85% +28.8 45.06 ± 95% -11.5 4.80 ±218% perf-profile.children.cycles-pp.asm_exc_page_fault
12.51 ± 84% +29.5 42.01 ± 95% -8.8 3.75 ±218% perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
12.52 ± 84% +29.6 42.11 ± 95% -8.8 3.75 ±218% perf-profile.children.cycles-pp.copy_subpage
12.53 ± 84% +29.7 42.25 ± 95% -8.8 3.76 ±218% perf-profile.children.cycles-pp.copy_user_large_folio
12.80 ± 84% +30.9 43.65 ± 95% -9.0 3.83 ±218% perf-profile.children.cycles-pp.hugetlb_wp
2.25 ± 17% -0.7 1.59 ±100% +1.1 3.36 ± 11% perf-profile.self.cycles-pp.nrand48_r
1.74 ± 21% -0.5 1.25 ± 92% +1.2 2.94 ± 13% perf-profile.self.cycles-pp.do_access
0.27 ± 17% -0.1 0.19 ±100% +0.1 0.40 ± 11% perf-profile.self.cycles-pp.lrand48_r
12.41 ± 84% +29.4 41.80 ± 95% -8.7 3.72 ±218% perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
350208 ± 16% -2.7% 340891 ± 36% -47.2% 184918 ± 9% sched_debug.cfs_rq:/.avg_vruntime.stddev
16833 ±149% -100.0% 3.19 ±100% -100.0% 0.58 ±179% sched_debug.cfs_rq:/.left_deadline.avg
2154658 ±149% -100.0% 317.15 ± 93% -100.0% 74.40 ±179% sched_debug.cfs_rq:/.left_deadline.max
189702 ±149% -100.0% 29.47 ± 94% -100.0% 6.55 ±179% sched_debug.cfs_rq:/.left_deadline.stddev
16833 ±149% -100.0% 3.05 ±102% -100.0% 0.58 ±179% sched_debug.cfs_rq:/.left_vruntime.avg
2154613 ±149% -100.0% 298.70 ± 95% -100.0% 74.06 ±179% sched_debug.cfs_rq:/.left_vruntime.max
189698 ±149% -100.0% 27.96 ± 96% -100.0% 6.52 ±179% sched_debug.cfs_rq:/.left_vruntime.stddev
350208 ± 16% -2.7% 340891 ± 36% -47.2% 184918 ± 9% sched_debug.cfs_rq:/.min_vruntime.stddev
52.88 ± 14% -19.5% 42.56 ± 39% +22.8% 64.94 ± 9% sched_debug.cfs_rq:/.removed.load_avg.stddev
16833 ±149% -100.0% 3.05 ±102% -100.0% 0.58 ±179% sched_debug.cfs_rq:/.right_vruntime.avg
2154613 ±149% -100.0% 298.70 ± 95% -100.0% 74.11 ±179% sched_debug.cfs_rq:/.right_vruntime.max
189698 ±149% -100.0% 27.96 ± 96% -100.0% 6.53 ±179% sched_debug.cfs_rq:/.right_vruntime.stddev
1588 ± 9% -31.2% 1093 ± 18% -20.0% 1270 ± 16% sched_debug.cfs_rq:/.runnable_avg.max
676.36 ± 7% -94.8% 35.08 ± 42% -2.7% 657.82 ± 3% sched_debug.cfs_rq:/.util_est.avg
1339 ± 8% -74.5% 341.42 ± 24% -22.6% 1037 ± 23% sched_debug.cfs_rq:/.util_est.max
152.67 ± 35% -72.3% 42.35 ± 21% -14.9% 129.89 ± 33% sched_debug.cfs_rq:/.util_est.stddev
1116839 ± 7% -7.1% 1037321 ± 4% +22.9% 1372316 ± 11% sched_debug.cpu.avg_idle.max
126915 ± 10% +31.6% 166966 ± 6% -12.2% 111446 ± 2% sched_debug.cpu.clock.avg
126930 ± 10% +31.6% 166977 ± 6% -12.2% 111459 ± 2% sched_debug.cpu.clock.max
126899 ± 10% +31.6% 166949 ± 6% -12.2% 111428 ± 2% sched_debug.cpu.clock.min
126491 ± 10% +31.7% 166537 ± 6% -12.2% 111078 ± 2% sched_debug.cpu.clock_task.avg
126683 ± 10% +31.6% 166730 ± 6% -12.2% 111237 ± 2% sched_debug.cpu.clock_task.max
117365 ± 11% +33.6% 156775 ± 6% -13.0% 102099 ± 2% sched_debug.cpu.clock_task.min
2826 ± 10% +178.1% 7858 ± 8% -10.3% 2534 ± 6% sched_debug.cpu.nr_switches.avg
755.38 ± 15% +423.8% 3956 ± 14% -15.2% 640.33 ± 3% sched_debug.cpu.nr_switches.min
126900 ± 10% +31.6% 166954 ± 6% -12.2% 111432 ± 2% sched_debug.cpu_clk
125667 ± 10% +31.9% 165721 ± 6% -12.3% 110200 ± 2% sched_debug.ktime
0.54 ±141% -99.9% 0.00 ±132% -99.9% 0.00 ±114% sched_debug.rt_rq:.rt_time.avg
69.73 ±141% -99.9% 0.06 ±132% -99.9% 0.07 ±114% sched_debug.rt_rq:.rt_time.max
6.14 ±141% -99.9% 0.01 ±132% -99.9% 0.01 ±114% sched_debug.rt_rq:.rt_time.stddev
127860 ± 10% +31.3% 167917 ± 6% -12.1% 112402 ± 2% sched_debug.sched_clk
15.99 +363.6% 74.14 ± 6% +10.1% 17.61 perf-stat.i.MPKI
1.467e+10 ± 2% -32.0% 9.975e+09 ± 3% +21.3% 1.779e+10 ± 2% perf-stat.i.branch-instructions
0.10 ± 5% +0.6 0.68 ± 5% +0.0 0.11 ± 4% perf-stat.i.branch-miss-rate%
10870114 ± 3% -26.4% 8001551 ± 3% +15.7% 12580898 ± 2% perf-stat.i.branch-misses
97.11 -20.0 77.11 -0.0 97.10 perf-stat.i.cache-miss-rate%
8.118e+08 ± 2% -32.5% 5.482e+08 ± 3% +23.1% 9.992e+08 ± 2% perf-stat.i.cache-misses
8.328e+08 ± 2% -28.4% 5.963e+08 ± 3% +22.8% 1.023e+09 ± 2% perf-stat.i.cache-references
2601 ± 2% +172.3% 7083 ± 3% +2.5% 2665 ± 5% perf-stat.i.context-switches
5.10 +39.5% 7.11 ± 9% -9.2% 4.62 perf-stat.i.cpi
2.826e+11 -44.1% 1.58e+11 ± 2% +5.7% 2.987e+11 ± 2% perf-stat.i.cpu-cycles
216.56 +42.4% 308.33 ± 6% +2.2% 221.23 perf-stat.i.cpu-migrations
358.79 -0.3% 357.70 ± 21% -14.1% 308.23 perf-stat.i.cycles-between-cache-misses
6.286e+10 ± 2% -31.7% 4.293e+10 ± 3% +21.3% 7.626e+10 ± 2% perf-stat.i.instructions
0.24 +39.9% 0.33 ± 4% +13.6% 0.27 perf-stat.i.ipc
5844 -16.9% 4856 ± 2% +12.5% 6577 perf-stat.i.minor-faults
5846 -16.9% 4857 ± 2% +12.5% 6578 perf-stat.i.page-faults
13.00 -2.2% 12.72 +1.2% 13.15 perf-stat.overall.MPKI
0.07 +0.0 0.08 -0.0 0.07 perf-stat.overall.branch-miss-rate%
97.44 -5.3 92.09 +0.2 97.66 perf-stat.overall.cache-miss-rate%
4.51 -18.4% 3.68 -13.0% 3.92 perf-stat.overall.cpi
346.76 -16.6% 289.11 -14.0% 298.06 perf-stat.overall.cycles-between-cache-misses
0.22 +22.6% 0.27 +15.0% 0.26 perf-stat.overall.ipc
10906 -3.4% 10541 -1.1% 10784 perf-stat.overall.path-length
1.445e+10 ± 2% -30.7% 1.001e+10 ± 3% +21.2% 1.752e+10 ± 2% perf-stat.ps.branch-instructions
10469697 ± 3% -23.5% 8005730 ± 3% +18.3% 12387061 ± 2% perf-stat.ps.branch-misses
8.045e+08 ± 2% -31.9% 5.478e+08 ± 3% +22.7% 9.874e+08 ± 2% perf-stat.ps.cache-misses
8.257e+08 ± 2% -27.9% 5.95e+08 ± 3% +22.5% 1.011e+09 ± 2% perf-stat.ps.cache-references
2584 ± 2% +169.3% 6958 ± 3% +2.7% 2654 ± 4% perf-stat.ps.context-switches
2.789e+11 -43.2% 1.583e+11 ± 2% +5.5% 2.943e+11 ± 2% perf-stat.ps.cpu-cycles
214.69 +41.8% 304.37 ± 6% +2.2% 219.46 perf-stat.ps.cpu-migrations
6.19e+10 ± 2% -30.4% 4.309e+10 ± 3% +21.3% 7.507e+10 ± 2% perf-stat.ps.instructions
5849 -18.0% 4799 ± 2% +12.3% 6568 ± 2% perf-stat.ps.minor-faults
5851 -18.0% 4800 ± 2% +12.3% 6570 ± 2% perf-stat.ps.page-faults
1.264e+13 -3.4% 1.222e+13 +0.5% 1.27e+13 perf-stat.total.instructions
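
As a cross-check on the headline figures, both %change values on the vm-scalability.throughput row can be recomputed from the raw numbers in the table, each taken relative to the first (parent) column. A minimal C sketch follows; the helper and constant names are illustrative and not part of the lkp tooling.

```c
#include <assert.h>

/* Relative change of `value` against `base`, in percent. */
static double pct_change(double base, double value)
{
    return (value - base) / base * 100.0;
}

/* vm-scalability.throughput values copied from the table above. */
static const double THROUGHPUT_PARENT = 6863720.0; /* 5a4d8944d6b1e */
static const double THROUGHPUT_COMMIT = 4532485.0; /* bd225530a4c71 */
static const double THROUGHPUT_V2     = 7805408.0; /* 9a5b87b521401 */

/* Checks that the recomputed changes round to the values the table reports. */
static void throughput_selfcheck(void)
{
    double regression = pct_change(THROUGHPUT_PARENT, THROUGHPUT_COMMIT);
    double with_v2    = pct_change(THROUGHPUT_PARENT, THROUGHPUT_V2);

    assert(regression > -34.05 && regression < -33.95); /* table: -34.0% */
    assert(with_v2 > 13.65 && with_v2 < 13.75);         /* table: +13.7% */
}
```

The same helper applies to any other row, for example time.elapsed_time (203.41 vs 168.71), since every %change column in this table is measured against the parent commit.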