Re: [lkp] [mm/vmstat] 6cdb18ad98: -8.5% will-it-scale.per_thread_ops

From: Huang, Ying
Date: Thu Jan 21 2016 - 01:47:55 EST


Heiko Carstens <heiko.carstens@xxxxxxxxxx> writes:

> On Wed, Jan 06, 2016 at 11:20:55AM +0800, kernel test robot wrote:
>> FYI, we noticed the below changes on
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>> commit 6cdb18ad98a49f7e9b95d538a0614cde827404b8 ("mm/vmstat: fix overflow in mod_zone_page_state()")
>>
>>
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>> gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/ivb42/pread1/will-it-scale
>>
>> commit:
>> cc28d6d80f6ab494b10f0e2ec949eacd610f66e3
>> 6cdb18ad98a49f7e9b95d538a0614cde827404b8
>>
>> cc28d6d80f6ab494 6cdb18ad98a49f7e9b95d538a0
>> ---------------- --------------------------
>>          %stddev     %change         %stddev
>>              \          |                \
>>    2733943 ±  0%      -8.5%    2502129 ±  0%  will-it-scale.per_thread_ops
>>       3410 ±  0%      -2.0%       3343 ±  0%  will-it-scale.time.system_time
>>     340.08 ±  0%     +19.7%     406.99 ±  0%  will-it-scale.time.user_time
>>   69882822 ±  2%     -24.3%   52926191 ±  5%  cpuidle.C1-IVT.time
>>     340.08 ±  0%     +19.7%     406.99 ±  0%  time.user_time
>>     491.25 ±  6%     -17.7%     404.25 ±  7%  numa-vmstat.node0.nr_alloc_batch
>>       2799 ± 20%     -36.6%       1776 ±  0%  numa-vmstat.node0.nr_mapped
>>     630.00 ±140%    +244.4%       2169 ±  1%  numa-vmstat.node1.nr_inactive_anon
>
> Hmm... this is odd. I did review all callers of mod_zone_page_state() and
> couldn't find anything obvious that would go wrong after the int -> long
> change.
>
> I also tried the "pread1_threads" test case from
> https://github.com/antonblanchard/will-it-scale.git
>
> However, the results seem to vary a lot after a reboot(!), at least on s390.
>
> So I'm not sure if this is really a regression.
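
For reference, the commit only widens the delta argument; for in-range
values the semantics are identical, so any performance delta presumably
comes from code generation or code layout at the call sites rather than
from a logic change. A rough sketch of the prototype change (paraphrased,
not the verbatim patch; see the commit for the exact hunks and any
additional helpers it touches):

/*
 * The per-zone counters are atomic_long_t, so an int delta could be
 * truncated when a caller passes a value wider than 32 bits.
 */
void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
			 int delta);	/* before: delta truncated to int */

void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
			 long delta);	/* after: matches counter width */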

Most of the regression is recovered in v4.4. But because the performance
changes are "V"-shaped across the commit range (dropping after the commit,
then mostly coming back by v4.4), it is hard to bisect.

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-4.9/performance/x86_64-rhel/thread/24/debian-x86_64-2015-02-07.cgz/ivb42/pread1/will-it-scale

commit:
cc28d6d80f6ab494b10f0e2ec949eacd610f66e3
6cdb18ad98a49f7e9b95d538a0614cde827404b8
v4.4

cc28d6d80f6ab494 6cdb18ad98a49f7e9b95d538a0 v4.4
---------------- -------------------------- --------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
   3083436 ±  0%      -9.6%    2788374 ±  0%      -3.7%    2970130 ±  0%  will-it-scale.per_thread_ops
      6447 ±  0%      -2.2%       6308 ±  0%      -0.3%       6425 ±  0%  will-it-scale.time.system_time
    776.90 ±  0%     +17.9%     915.71 ±  0%      +2.9%     799.12 ±  0%  will-it-scale.time.user_time
    316177 ±  4%      -4.6%     301616 ±  3%     -10.3%     283563 ±  3%  softirqs.RCU
    776.90 ±  0%     +17.9%     915.71 ±  0%      +2.9%     799.12 ±  0%  time.user_time
    777.33 ±  7%     +20.8%     938.67 ±  7%      +7.5%     836.00 ±  8%  slabinfo.blkdev_requests.active_objs
    777.33 ±  7%     +20.8%     938.67 ±  7%      +7.5%     836.00 ±  8%  slabinfo.blkdev_requests.num_objs
  74313962 ± 44%     -16.5%   62053062 ± 41%     -49.9%   37246967 ±  8%  cpuidle.C1-IVT.time
  43381614 ± 79%     +24.4%   53966568 ±111%    +123.9%   97135791 ± 33%  cpuidle.C1E-IVT.time
     97.67 ± 36%     +95.2%     190.67 ± 63%    +122.5%     217.33 ± 41%  cpuidle.C3-IVT.usage
   3679437 ± 69%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
   5177475 ± 82%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
  11726393 ±112%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
    178.07 ±  0%      -1.3%     175.79 ±  0%      -0.8%     176.62 ±  0%  turbostat.CorWatt
      0.20 ±  2%     -16.9%       0.16 ± 18%     -11.9%       0.17 ± 17%  turbostat.Pkg%pc6
    207.38 ±  0%      -1.1%     205.13 ±  0%      -0.7%     205.99 ±  0%  turbostat.PkgWatt
      6889 ± 33%     -49.2%       3497 ± 86%     -19.4%       5552 ± 27%  numa-vmstat.node0.nr_active_anon
    483.33 ± 29%     -32.3%     327.00 ± 48%      +0.1%     483.67 ± 29%  numa-vmstat.node0.nr_page_table_pages
     27536 ± 96%     +10.9%      30535 ± 78%    +148.5%      68418 ±  2%  numa-vmstat.node0.numa_other
    214.00 ± 11%     +18.1%     252.67 ±  4%      +2.8%     220.00 ±  9%  numa-vmstat.node1.nr_kernel_stack
    370.67 ± 38%     +42.0%     526.33 ± 30%      -0.2%     370.00 ± 39%  numa-vmstat.node1.nr_page_table_pages
     61177 ± 43%      -5.2%      57976 ± 41%     -66.3%      20644 ± 10%  numa-vmstat.node1.numa_other
     78172 ± 13%     -16.1%      65573 ± 18%      -5.8%      73626 ±  9%  numa-meminfo.node0.Active
     27560 ± 33%     -49.2%      14006 ± 86%     -19.4%      22203 ± 27%  numa-meminfo.node0.Active(anon)
      3891 ± 58%     -38.1%       2407 ±100%     -58.8%       1604 ±110%  numa-meminfo.node0.AnonHugePages
      1934 ± 29%     -32.3%       1309 ± 48%      +0.1%       1936 ± 29%  numa-meminfo.node0.PageTables
     63139 ± 17%     +19.8%      75670 ± 16%      +6.0%      66937 ± 10%  numa-meminfo.node1.Active
      3432 ± 11%     +18.0%       4049 ±  4%      +2.8%       3527 ±  9%  numa-meminfo.node1.KernelStack
      1483 ± 38%     +42.0%       2106 ± 30%      -0.2%       1481 ± 39%  numa-meminfo.node1.PageTables
      1.47 ±  1%     -11.8%       1.30 ±  2%      -7.0%       1.37 ±  3%  perf-profile.cycles-pp.___might_sleep.__might_sleep.find_lock_entry.shmem_getpage_gfp.shmem_file_read_iter
      2.00 ±  2%     -11.3%       1.78 ±  2%      -7.2%       1.86 ±  2%  perf-profile.cycles-pp.__might_sleep.find_lock_entry.shmem_getpage_gfp.shmem_file_read_iter.__vfs_read
      2.30 ±  4%     +33.6%       3.07 ±  0%      -1.9%       2.26 ±  1%  perf-profile.cycles-pp.atime_needs_update.touch_atime.shmem_file_read_iter.__vfs_read.vfs_read
      1.05 ±  1%     -27.7%       0.76 ±  1%      -8.0%       0.96 ±  0%  perf-profile.cycles-pp.current_fs_time.atime_needs_update.touch_atime.shmem_file_read_iter.__vfs_read
      2.21 ±  3%     -11.9%       1.94 ±  2%      -9.4%       2.00 ±  0%  perf-profile.cycles-pp.fput.entry_SYSCALL_64_fastpath
      0.78 ±  2%     +38.5%       1.08 ±  2%     +23.1%       0.96 ±  3%  perf-profile.cycles-pp.fsnotify.vfs_read.sys_pread64.entry_SYSCALL_64_fastpath
      2.87 ±  7%     +42.6%       4.09 ±  1%      -0.3%       2.86 ±  2%  perf-profile.cycles-pp.touch_atime.shmem_file_read_iter.__vfs_read.vfs_read.sys_pread64
      6.68 ±  2%      -7.3%       6.19 ±  1%      -6.7%       6.23 ±  1%  perf-profile.cycles-pp.unlock_page.shmem_file_read_iter.__vfs_read.vfs_read.sys_pread64


Best Regards,
Huang, Ying