Re: [lkp] [shmem] 071904e8df: meminfo.AnonHugePages +553.5% increase
From: Ye Xiaolong
Date: Thu Aug 04 2016 - 23:20:38 EST
On 08/04, Kirill A. Shutemov wrote:
>On Thu, Aug 04, 2016 at 04:54:09PM +0800, kernel test robot wrote:
>>
>> FYI, we noticed a +553.5% increase in meminfo.AnonHugePages due to commit:
>>
>> commit 071904e8dfed9525f9da86523caf78b6da5f9e7e ("shmem: get_unmapped_area align huge page")
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>>
>> in testcase: vm-scalability
>> on test machine: 128 threads 4 Sockets Haswell-EP with 512G memory
>> with the following parameters:
>>
>> path_params: 300s-16G-shm-pread-rand-mt-performance
>> run:
>>
>>
>>
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>>
>> Details are as below:
>> -------------------------------------------------------------------------------------------------->
>>
>>
>> To reproduce:
>>
>> git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>> cd lkp-tests
>> bin/lkp install job.yaml # job file is attached in this email
>> bin/lkp run job.yaml
>>
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
>> gcc-6/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/300s/16G/lkp-hsw-4ep1/shm-pread-rand-mt/vm-scalability
>>
>> commit:
>> 6a028aa3ca ("shmem: prepare huge= mount option and sysfs knob")
>> 071904e8df ("shmem: get_unmapped_area align huge page")
>>
>> 6a028aa3ca32379e 071904e8dfed9525f9da86523c
>> ---------------- --------------------------
>> %stddev %change %stddev
>> \ | \
>> 20428 ± 9% +553.5% 133500 ± 7% meminfo.AnonHugePages
>> 42717 ± 4% +261.3% 154340 ± 6% meminfo.AnonPages
>
>Hm. That's strange. I didn't expect this commit to change anything for anon
>memory.
>
>Do you see the same effect from the commit in Linus' tree?
Yes, here is the comparison between the commit in Linus' tree (c01d5b3007)
and its parent (5a6e75f811 "shmem: prepare huge= mount option and sysfs
knob"):
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
gcc-6/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/300s/16G/lkp-hsw-4ep1/shm-pread-rand-mt/vm-scalability
commit:
5a6e75f8110c97e2a5488894d4e922187e6cb343
c01d5b300774d130a24d787825b01eb24e6e20cb
5a6e75f8110c97e2 c01d5b300774d130a24d787825
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
%stddev %change %stddev
\ | \
533.92 ± 5% +7.8% 575.31 ± 4% vm-scalability.time.elapsed_time
533.92 ± 5% +7.8% 575.31 ± 4% vm-scalability.time.elapsed_time.max
1.567e+09 ± 7% +31.0% 2.053e+09 ± 11% vm-scalability.time.maximum_resident_set_size
7141 ± 5% -7.0% 6638 ± 5% vm-scalability.time.percent_of_cpu_this_job_got
1708341 ± 29% +48.3% 2534068 ± 21% vm-scalability.time.voluntary_context_switches
5379895 ± 12% +94.3% 10454779 ± 13% meminfo.Active
5351287 ± 12% +94.8% 10425228 ± 13% meminfo.Active(anon)
20673 ± 1% +570.9% 138687 ± 4% meminfo.AnonHugePages
42881 ± 2% +276.2% 161340 ± 4% meminfo.AnonPages
9570950 ± 7% +93.3% 18504763 ± 10% meminfo.Cached
9461500 ± 7% +94.3% 18381777 ± 11% meminfo.Committed_AS
6363833 ± 7% +11.3% 7083357 ± 0% meminfo.DirectMap2M
4233226 ± 1% +94.0% 8210642 ± 7% meminfo.Inactive
3759914 ± 1% +105.8% 7738210 ± 7% meminfo.Inactive(anon)
8500775 ± 8% +72.3% 14648157 ± 12% meminfo.Mapped
546828 ± 2% +11.2% 608309 ± 1% meminfo.SReclaimable
9069484 ± 7% +98.5% 18003233 ± 11% meminfo.Shmem
1.567e+09 ± 7% +31.0% 2.053e+09 ± 11% time.maximum_resident_set_size
1708341 ± 29% +48.3% 2534068 ± 21% time.voluntary_context_switches
10225359 ± 7% +87.6% 19185983 ± 10% vmstat.memory.cache
9386 ± 14% +24.7% 11702 ± 10% vmstat.system.cs
4395191 ± 1% +64.1% 7210638 ± 16% numa-numastat.node0.local_node
4395191 ± 1% +64.1% 7210639 ± 16% numa-numastat.node0.numa_hit
2.33 ± 20% -85.7% 0.33 ±141% numa-numastat.node3.other_node
838804 ± 2% +12.2% 941216 ± 1% slabinfo.radix_tree_node.active_objs
15101 ± 2% +12.6% 16999 ± 1% slabinfo.radix_tree_node.active_slabs
845689 ± 2% +12.6% 951997 ± 1% slabinfo.radix_tree_node.num_objs
15101 ± 2% +12.6% 16999 ± 1% slabinfo.radix_tree_node.num_slabs
58.13 ± 5% -6.4% 54.39 ± 4% turbostat.%Busy
697.67 ± 5% -6.5% 652.33 ± 4% turbostat.Avg_MHz
28.10 ± 7% +9.6% 30.81 ± 5% turbostat.CPU%c6
17.67 ± 8% +12.4% 19.86 ± 6% turbostat.Pkg%pc2
124.17 ± 1% -1.9% 121.83 ± 1% turbostat.PkgWatt
53777848 ± 31% +41.2% 75939221 ± 19% cpuidle.C1-HSW.time
119888 ± 12% +23.2% 147684 ± 7% cpuidle.C1E-HSW.usage
51609995 ± 2% +11.1% 57338896 ± 5% cpuidle.C3-HSW.time
2.895e+10 ± 12% +17.2% 3.392e+10 ± 10% cpuidle.C6-HSW.time
30211909 ± 12% +17.1% 35368627 ± 9% cpuidle.C6-HSW.usage
20912 ± 33% +44.4% 30196 ± 24% cpuidle.POLL.usage
4381 ± 6% +804.5% 39633 ± 27% latency_stats.max.call_rwsem_down_read_failed.__do_page_fault.do_page_fault.page_fault
1296 ± 14% +1387.3% 19285 ± 13% latency_stats.max.call_rwsem_down_write_failed_killable.SyS_mprotect.entry_SYSCALL_64_fastpath
1360 ± 1% +2784.4% 39246 ± 28% latency_stats.max.call_rwsem_down_write_failed_killable.vm_mmap_pgoff.SyS_mmap_pgoff.SyS_mmap.entry_SYSCALL_64_fastpath
85887 ± 9% +161.7% 224726 ± 26% latency_stats.sum.call_rwsem_down_write_failed_killable.vm_mmap_pgoff.SyS_mmap_pgoff.SyS_mmap.entry_SYSCALL_64_fastpath
16064 ± 47% +729.7% 133283 ±116% latency_stats.sum.down.console_lock.console_device.tty_open.chrdev_open.do_dentry_open.vfs_open.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
96.33 ±107% +28900.7% 27937 ± 17% latency_stats.sum.stop_two_cpus.migrate_swap.task_numa_migrate.numa_migrate_preferred.task_numa_fault.do_huge_pmd_numa_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
48184 ± 21% -87.5% 6011 ± 46% latency_stats.sum.stop_two_cpus.migrate_swap.task_numa_migrate.numa_migrate_preferred.task_numa_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
1.89 ± 2% +3.3% 1.96 ± 2% perf-stat.branch-miss-rate
5031519 ± 19% +34.2% 6750057 ± 14% perf-stat.context-switches
5.336e+09 ± 8% +19.8% 6.39e+09 ± 11% perf-stat.dTLB-load-misses
2.922e+08 ± 26% +41.0% 4.119e+08 ± 13% perf-stat.dTLB-store-misses
2.82e+11 ± 9% +16.2% 3.275e+11 ± 9% perf-stat.dTLB-stores
4.347e+08 ± 6% +12.2% 4.878e+08 ± 8% perf-stat.iTLB-load-misses
3764 ± 1% -4.2% 3605 ± 1% perf-stat.instructions-per-iTLB-miss
1.14 ± 7% +10.0% 1.25 ± 3% perf-profile.cycles-pp.alloc_set_pte.filemap_map_pages.handle_mm_fault.__do_page_fault.do_page_fault
1.58 ± 13% -23.5% 1.21 ± 11% perf-profile.cycles-pp.call_cpuidle.cpu_startup_entry.start_secondary
1.57 ± 13% -23.3% 1.21 ± 11% perf-profile.cycles-pp.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
1.57 ± 13% -23.4% 1.20 ± 11% perf-profile.cycles-pp.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
1.56 ± 13% -23.5% 1.20 ± 11% perf-profile.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
2.50 ± 35% -39.7% 1.51 ± 5% perf-profile.cycles-pp.radix_tree_next_chunk.filemap_map_pages.handle_mm_fault.__do_page_fault.do_page_fault
1.38 ± 2% +19.4% 1.64 ± 5% perf-profile.cycles-pp.unlock_page.filemap_map_pages.handle_mm_fault.__do_page_fault.do_page_fault
4.52 ± 7% +13.2% 5.12 ± 7% perf-profile.func.cycles-pp.__do_page_fault
1.57 ± 13% -22.5% 1.22 ± 11% perf-profile.func.cycles-pp.intel_idle
2.51 ± 35% -39.6% 1.51 ± 5% perf-profile.func.cycles-pp.radix_tree_next_chunk
1.15 ± 2% +17.1% 1.35 ± 2% perf-profile.func.cycles-pp.unlock_page
10850 ± 14% +8227.4% 903577 ± 38% numa-vmstat.node0.nr_active_anon
2176 ± 38% +367.8% 10182 ± 14% numa-vmstat.node0.nr_anon_pages
66676 ± 42% +2339.4% 1626473 ± 35% numa-vmstat.node0.nr_file_pages
26629 ±102% +2535.0% 701677 ± 38% numa-vmstat.node0.nr_inactive_anon
11618 ± 48% +11166.1% 1308898 ± 37% numa-vmstat.node0.nr_mapped
35359 ± 79% +4411.2% 1595158 ± 36% numa-vmstat.node0.nr_shmem
6574 ± 13% +717.9% 53768 ± 48% numa-vmstat.node0.nr_slab_reclaimable
4237359 ± 1% +50.3% 6368198 ± 12% numa-vmstat.node0.numa_hit
4237359 ± 1% +50.3% 6368197 ± 12% numa-vmstat.node0.numa_local
2535 ± 41% +274.2% 9485 ± 3% numa-vmstat.node1.nr_anon_pages
2171 ± 37% +379.1% 10404 ± 12% numa-vmstat.node2.nr_anon_pages
3820 ± 28% +150.0% 9551 ± 13% numa-vmstat.node3.nr_anon_pages
1.33 ± 35% -100.0% 0.00 ± 0% numa-vmstat.node3.numa_other
11.41 ± 16% +53.1% 17.47 ± 11% sched_debug.cfs_rq:/.nr_spread_over.max
1.07 ± 10% +49.5% 1.60 ± 4% sched_debug.cfs_rq:/.nr_spread_over.stddev
-250444 ±-49% -310.3% 526574 ± 32% sched_debug.cfs_rq:/.spread0.avg
540891 ± 15% +149.0% 1346786 ± 13% sched_debug.cfs_rq:/.spread0.max
-2435794 ±-38% -78.7% -517627 ±-39% sched_debug.cfs_rq:/.spread0.min
2894052 ± 34% -37.3% 1813439 ± 19% sched_debug.cpu.avg_idle.max
7480 ± 13% +34.7% 10076 ± 11% sched_debug.cpu.curr->pid.max
717.88 ± 9% +31.8% 946.21 ± 10% sched_debug.cpu.curr->pid.stddev
0.57 ± 4% -6.8% 0.53 ± 7% sched_debug.cpu.nr_running.avg
26154 ± 20% +37.1% 35853 ± 17% sched_debug.cpu.nr_switches.avg
114.34 ± 20% +44.1% 164.81 ± 18% sched_debug.cpu.nr_uninterruptible.max
27422 ± 19% +35.2% 37086 ± 16% sched_debug.cpu.sched_count.avg
10181 ± 26% +45.8% 14845 ± 23% sched_debug.cpu.sched_goidle.avg
13214 ± 22% +38.9% 18349 ± 17% sched_debug.cpu.ttwu_count.avg
50734 ± 16% +20.0% 60878 ± 11% sched_debug.cpu.ttwu_count.max
49950 ± 12% +7032.4% 3562633 ± 38% numa-meminfo.node0.Active
42950 ± 14% +8177.8% 3555332 ± 38% numa-meminfo.node0.Active(anon)
3891 ± 75% +784.1% 34401 ± 19% numa-meminfo.node0.AnonHugePages
8729 ± 37% +371.5% 41161 ± 14% numa-meminfo.node0.AnonPages
266549 ± 42% +2327.8% 6471326 ± 35% numa-meminfo.node0.FilePages
225100 ± 48% +1210.3% 2949477 ± 37% numa-meminfo.node0.Inactive
106834 ±102% +2550.4% 2831519 ± 38% numa-meminfo.node0.Inactive(anon)
46397 ± 48% +11142.4% 5216201 ± 37% numa-meminfo.node0.Mapped
16087217 ± 1% +40.2% 22561797 ± 10% numa-meminfo.node0.MemUsed
26298 ± 13% +716.8% 214805 ± 48% numa-meminfo.node0.SReclaimable
141283 ± 80% +4391.7% 6346066 ± 36% numa-meminfo.node0.Shmem
70054 ± 7% +272.7% 261076 ± 38% numa-meminfo.node0.Slab
5637 ± 68% +491.8% 33364 ± 3% numa-meminfo.node1.AnonHugePages
10152 ± 41% +278.0% 38372 ± 4% numa-meminfo.node1.AnonPages
2605 ± 62% +1292.3% 36268 ± 13% numa-meminfo.node2.AnonHugePages
8703 ± 37% +384.3% 42147 ± 12% numa-meminfo.node2.AnonPages
8520 ± 43% +295.3% 33678 ± 17% numa-meminfo.node3.AnonHugePages
15289 ± 28% +153.0% 38686 ± 13% numa-meminfo.node3.AnonPages
1350786 ± 12% +94.8% 2630698 ± 13% proc-vmstat.nr_active_anon
10719 ± 2% +274.3% 40123 ± 4% proc-vmstat.nr_anon_pages
2398874 ± 7% +93.6% 4643328 ± 10% proc-vmstat.nr_file_pages
933039 ± 1% +106.5% 1927023 ± 7% proc-vmstat.nr_inactive_anon
2130483 ± 8% +72.3% 3671722 ± 12% proc-vmstat.nr_mapped
2273394 ± 7% +98.7% 4517832 ± 11% proc-vmstat.nr_shmem
136866 ± 2% +11.2% 152260 ± 1% proc-vmstat.nr_slab_reclaimable
22001919 ± 1% +16.8% 25708975 ± 3% proc-vmstat.numa_hit
545.33 ± 69% +10112.3% 55691 ± 0% proc-vmstat.numa_huge_pte_updates
22001915 ± 1% +16.8% 25708971 ± 3% proc-vmstat.numa_local
7627 ± 61% +8440.7% 651455 ± 19% proc-vmstat.numa_pages_migrated
345336 ± 55% +8160.0% 28524605 ± 0% proc-vmstat.numa_pte_updates
3147854 ± 5% +97.3% 6210234 ± 9% proc-vmstat.pgactivate
116038 ± 5% +25.3% 145341 ± 11% proc-vmstat.pgalloc_dma32
22120200 ± 1% +19.8% 26508050 ± 3% proc-vmstat.pgalloc_normal
21964870 ± 2% +19.0% 26129865 ± 3% proc-vmstat.pgfree
7627 ± 61% +8440.7% 651455 ± 19% proc-vmstat.pgmigrate_success
19.33 ± 51% +7087.9% 1389 ± 17% proc-vmstat.thp_deferred_split_page
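
For reference, the counters that moved most can also be spot-checked outside
the LKP harness by sampling /proc/meminfo while vm-scalability runs. A minimal
sketch (the field names match the meminfo.* metrics above; the 5-second
interval is arbitrary, not what LKP uses):

    # Sample the relevant /proc/meminfo counters every 5 seconds while the
    # workload is running, to confirm the AnonHugePages/AnonPages/Shmem delta
    # independently of the LKP post-processing.
    while true; do
            grep -E 'AnonHugePages|AnonPages|Shmem:|Mapped' /proc/meminfo
            echo ---
            sleep 5
    done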
Thanks,
Xiaolong
>
>--
> Kirill A. Shutemov