Re: [LKP] [vmstat] c3916839649: -63.9% will-it-scale.time.involuntary_context_switches, -17.6% will-it-scale.time.voluntary_context_switches, no primary result change

From: Michal Hocko
Date: Mon Feb 16 2015 - 11:27:19 EST


On Sun 15-02-15 15:55:19, Huang Ying wrote:
> FYI, we noticed the below changes on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> commit c3916839649ddb3c0d541ce346360c4197898d52 ("vmstat: do not use deferrable delayed work for vmstat_update")
>
>
> testbox/testcase/testparams: wsm/will-it-scale/performance-malloc1
>
> 516245a5b8e7c56e c3916839649ddb3c0d541ce346
> ---------------- --------------------------
> 1213 ± 1% -63.9% 438 ± 6% will-it-scale.time.involuntary_context_switches
> 2978260 ± 4% -17.6% 2453255 ± 3% will-it-scale.time.voluntary_context_switches

Do I understand the numbers correctly that the patch helped to reduce
context switches because vmstat no longer interferes as much? If so, I am
not sure how that would be possible, because vmstat workers were called
less often before this patch.
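
For reference, the change under discussion boils down to no longer using a
deferrable timer for the per-cpu vmstat work, so vmstat_update() fires
roughly every stat interval even on an otherwise idle CPU. A minimal
sketch, paraphrasing mm/vmstat.c of that era (the surrounding code differs
between kernel versions):

        static void start_cpu_timer(int cpu)
        {
                struct delayed_work *work = &per_cpu(vmstat_work, cpu);

                /*
                 * Before the patch the work was deferrable, i.e. its timer
                 * does not wake an idle CPU, so folding the per-cpu
                 * counters can be deferred for a long time:
                 *
                 *      INIT_DEFERRABLE_WORK(work, vmstat_update);
                 *
                 * With the patch it is a regular delayed work, so
                 * vmstat_update() keeps firing on schedule even when the
                 * CPU is otherwise idle:
                 */
                INIT_DELAYED_WORK(work, vmstat_update);
                schedule_delayed_work_on(cpu, work,
                                         __round_jiffies_relative(HZ, cpu));
        }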

> 242 ± 0% +3.3% 250 ± 1% will-it-scale.time.system_time

Hmm, the system time has increased while the user time (below) has
decreased (roughly +8s of system time against -1s of user time). What was
the elapsed time?

I am not familiar with the sched_debug statistics, and I am also not
exactly sure what the intention of this report is. Is this supposed to be
a regression?

> 1274586 ± 5% -84.6% 196270 ± 2% cpuidle.C6-NHM.usage
> 140438 ± 18% -72.4% 38803 ± 14% cpuidle.C1-NHM.usage
> 33943 ± 8% -72.9% 9203 ± 5% sched_debug.cfs_rq[5]:/.exec_clock
> 114238 ± 19% +72.3% 196836 ± 17% sched_debug.cpu#4.ttwu_local
> 308423 ± 12% -68.9% 95778 ± 1% sched_debug.cfs_rq[5]:/.min_vruntime
> 146477 ± 10% +48.7% 217860 ± 17% sched_debug.cpu#4.ttwu_count
> 64938 ± 16% -26.8% 47543 ± 4% sched_debug.cpu#0.ttwu_count
> 14242451 ± 14% -65.4% 4921778 ± 15% cpuidle.C1-NHM.time
> 37 ± 14% -52.7% 17 ± 31% sched_debug.cpu#5.cpu_load[4]
> 1213 ± 1% -63.9% 438 ± 6% time.involuntary_context_switches
> 61803 ± 13% -25.4% 46093 ± 4% sched_debug.cpu#0.sched_goidle
> 40 ± 13% -52.8% 19 ± 33% sched_debug.cpu#5.cpu_load[3]
> 46 ± 17% -52.2% 22 ± 39% sched_debug.cpu#5.cpu_load[2]
> 79 ± 10% +59.3% 126 ± 32% sched_debug.cpu#6.cpu_load[4]
> 73 ± 10% +54.9% 113 ± 33% sched_debug.cpu#6.cpu_load[3]
> 48956 ± 5% -46.1% 26390 ± 12% sched_debug.cpu#5.nr_load_updates
> 54 ± 26% -50.0% 27 ± 48% sched_debug.cpu#5.cpu_load[1]
> 327 ± 10% -44.1% 182 ± 8% sched_debug.cfs_rq[5]:/.tg_runnable_contrib
> 15032 ± 10% -43.9% 8426 ± 8% sched_debug.cfs_rq[5]:/.avg->runnable_avg_sum
> 32594 ± 9% +78.4% 58157 ± 0% sched_debug.cfs_rq[11]:/.exec_clock
> 285793 ± 11% +73.6% 496180 ± 3% sched_debug.cfs_rq[11]:/.min_vruntime
> 83 ± 15% -30.4% 57 ± 11% sched_debug.cpu#10.load
> 29 ± 12% +72.4% 50 ± 5% sched_debug.cpu#11.cpu_load[4]
> 31 ± 9% +59.8% 50 ± 6% sched_debug.cpu#11.cpu_load[3]
> 48 ± 15% +41.5% 69 ± 11% sched_debug.cfs_rq[9]:/.runnable_load_avg
> 68 ± 25% +41.9% 96 ± 20% sched_debug.cfs_rq[9]:/.load
> 81 ± 17% -26.5% 59 ± 12% sched_debug.cfs_rq[10]:/.load
> 521 ± 9% +43.4% 748 ± 6% cpuidle.POLL.usage
> 14974 ± 12% +46.0% 21864 ± 1% sched_debug.cfs_rq[11]:/.avg->runnable_avg_sum
> 327 ± 12% +45.8% 476 ± 1% sched_debug.cfs_rq[11]:/.tg_runnable_contrib
> 58930 ± 6% +41.2% 83196 ± 1% sched_debug.cpu#11.nr_load_updates
> 5326 ± 13% -18.5% 4343 ± 4% meminfo.AnonHugePages
> 222147 ± 7% +47.1% 326809 ± 30% sched_debug.cfs_rq[6]:/.spread0
> 149897 ± 12% +44.3% 216276 ± 17% sched_debug.cpu#4.sched_goidle
> 127451 ± 13% -24.2% 96598 ± 4% sched_debug.cpu#0.nr_switches
> 127599 ± 13% -24.2% 96740 ± 4% sched_debug.cpu#0.sched_count
> 36 ± 8% +42.1% 51 ± 8% sched_debug.cpu#11.cpu_load[2]
> 301690 ± 12% +43.9% 434016 ± 17% sched_debug.cpu#4.nr_switches
> 301787 ± 12% +43.8% 434096 ± 17% sched_debug.cpu#4.sched_count
> 116070 ± 3% -25.4% 86585 ± 9% sched_debug.cpu#11.ttwu_count
> 1824349 ± 4% +29.4% 2361351 ± 4% cpuidle.C3-NHM.usage
> 10615 ± 13% -17.4% 8766 ± 13% sched_debug.cfs_rq[3]:/.avg->runnable_avg_sum
> 230 ± 13% -17.4% 190 ± 13% sched_debug.cfs_rq[3]:/.tg_runnable_contrib
> 61 ± 14% -19.4% 49 ± 12% sched_debug.cfs_rq[10]:/.runnable_load_avg
> 99092 ± 10% -16.8% 82477 ± 0% sched_debug.cpu#10.sched_goidle
> 1175 ± 12% +18.8% 1397 ± 11% sched_debug.cpu#11.curr->pid
> 199498 ± 10% -16.9% 165794 ± 0% sched_debug.cpu#10.sched_count
> 199295 ± 10% -16.9% 165611 ± 0% sched_debug.cpu#10.nr_switches
> 326090 ± 12% -11.6% 288127 ± 9% sched_debug.cfs_rq[2]:/.min_vruntime
> 6.955e+08 ± 1% +25.2% 8.707e+08 ± 1% cpuidle.C3-NHM.time
> 2978260 ± 4% -17.6% 2453255 ± 3% time.voluntary_context_switches
> 1.06e+08 ± 3% -21.3% 83401246 ± 8% cpuidle.C1E-NHM.time
> 109454 ± 4% -19.9% 87649 ± 8% sched_debug.cpu#6.ttwu_count
> 48 ± 3% +17.6% 56 ± 10% sched_debug.cpu#9.cpu_load[2]
> 139443 ± 5% +15.4% 160850 ± 4% proc-vmstat.thp_fault_alloc
> 139441 ± 5% +15.4% 160848 ± 4% proc-vmstat.thp_split
> 220724 ± 2% -20.1% 176362 ± 9% sched_debug.cpu#11.nr_switches
> 220846 ± 2% -20.1% 176481 ± 9% sched_debug.cpu#11.sched_count
> 109644 ± 2% -20.0% 87693 ± 9% sched_debug.cpu#11.sched_goidle
> 46 ± 2% +17.3% 54 ± 8% sched_debug.cpu#9.cpu_load[3]
> 79492 ± 1% +12.9% 89782 ± 4% softirqs.SCHED
> 100431 ± 8% -17.2% 83150 ± 9% sched_debug.cpu#9.ttwu_count
> 726690 ± 2% +11.9% 813169 ± 5% sched_debug.cpu#4.avg_idle
> 91979 ± 0% +12.9% 103877 ± 12% sched_debug.cfs_rq[6]:/.exec_clock
> 201606 ± 5% -11.7% 177929 ± 5% sched_debug.cpu#6.nr_switches
> 201773 ± 5% -11.7% 178108 ± 5% sched_debug.cpu#6.sched_count
> 100423 ± 5% -11.8% 88575 ± 5% sched_debug.cpu#6.sched_goidle
> 40821 ± 2% +15.8% 47259 ± 5% softirqs.RCU
> 44 ± 3% +16.9% 52 ± 8% sched_debug.cpu#9.cpu_load[4]
> 9.56 ± 4% -11.6% 8.45 ± 1% time.user_time
> 306985 ± 12% -12.0% 270022 ± 4% sched_debug.cfs_rq[1]:/.min_vruntime
> 1.544e+09 ± 1% -10.0% 1.389e+09 ± 0% cpuidle.C6-NHM.time
> 115366 ± 1% +11.3% 128364 ± 10% sched_debug.cpu#6.nr_load_updates
> 22.81 ± 1% +10.8% 25.26 ± 1% turbostat.CPU%c3
> 25858 ± 3% -13.3% 22427 ± 2% vmstat.system.cs
>
> wsm: Westmere
> Memory: 6G
>
>
>
>
> time.involuntary_context_switches
>
> 1300 ++-------------------------------------------------------------------+
> 1200 *+*.. .*. .*. .*. .*.*.*..*.*
> | .*. *.*.*..*.*.*.. .*..* *.*..*.*.*. *.*..* |
> 1100 ++ *.* *.* |
> 1000 ++ |
> | |
> 900 ++ |
> 800 ++ |
> 700 ++ |
> | |
> 600 ++ |
> 500 ++ O O O |
> O O O O O O O O O O O O O |
> 400 ++O O O O O |
> 300 ++-------------------------------------------------------------------+
>
>
> cpuidle.C3-NHM.time
>
> 9.5e+08 ++----------------------------------------------------------------+
> | |
> 9e+08 ++ O |
> O O O O O O O O O O |
> | O O O O O O O O O O |
> 8.5e+08 ++ |
> | |
> 8e+08 ++ |
> | |
> 7.5e+08 ++ |
> | .* *.*. .*. *. |
> | .*.* *.*. + + *..* *.*. + * |
> 7e+08 *+ .*. + + * *..*.* + .*.*.*. .*..*. .*
> | * * *. * * |
> 6.5e+08 ++----------------------------------------------------------------+
>
>
> cpuidle.C6-NHM.time
>
> 1.6e+09 ++---------------------------------------------------------------+
> | *. *. * |
> |.*.*.. *.*. : * .* .* * : *.*. ::|
> 1.55e+09 *+ + *. : : * + .* + + + : *.*..* : :|
> | * *. : : + *..* * *..* + : *
> | * * * |
> 1.5e+09 ++ |
> | |
> 1.45e+09 ++ |
> | |
> | O O |
> 1.4e+09 ++ O O |
> O O O O O O O O O O O |
> | O O O O O |
> 1.35e+09 ++----------------------------O----------------------------------+
>
>
> cpuidle.C6-NHM.usage
>
> 1.4e+06 ++--------------------------------------------------------------*-+
> *.*. *.* .* .* *
> 1.2e+06 ++ *..*. + + .*.*. .*. .*. .*. + .*.*. .*.*. .*.*. |
> | * *. * *. *.*.* * *. * |
> 1e+06 ++ |
> | |
> 800000 ++ |
> | |
> 600000 ++ |
> | |
> 400000 ++ |
> | |
> 200000 O+O O O O O O O O O O O O O O O O O O O O |
> | |
> 0 ++----------------------------------------------------------------+
>
>
> will-it-scale.time.involuntary_context_switches
>
> 1300 ++-------------------------------------------------------------------+
> 1200 *+*.. .*. .*. .*. .*.*.*..*.*
> | .*. *.*.*..*.*.*.. .*..* *.*..*.*.*. *.*..* |
> 1100 ++ *.* *.* |
> 1000 ++ |
> | |
> 900 ++ |
> 800 ++ |
> 700 ++ |
> | |
> 600 ++ |
> 500 ++ O O O |
> O O O O O O O O O O O O O |
> 400 ++O O O O O |
> 300 ++-------------------------------------------------------------------+
>
>
> [*] bisect-good sample
> [O] bisect-bad sample
>
> To reproduce:
>
> apt-get install ruby
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> cd lkp-tests
> bin/setup-local job.yaml # the job file attached in this email
> bin/run-local job.yaml
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> Thanks,
> Huang, Ying
>

> ---
> testcase: will-it-scale
> default-monitors:
> wait: pre-test
> uptime:
> iostat:
> vmstat:
> numa-numastat:
> numa-vmstat:
> numa-meminfo:
> proc-vmstat:
> proc-stat:
> meminfo:
> slabinfo:
> interrupts:
> lock_stat:
> latency_stats:
> softirqs:
> bdi_dev_mapping:
> diskstats:
> nfsstat:
> cpuidle:
> cpufreq-stats:
> turbostat:
> pmeter:
> sched_debug:
> interval: 10
> default_watchdogs:
> watch-oom:
> watchdog:
> cpufreq_governor: performance
> commit: 7496ad672d0ed7177fefec9c70ece8a70bbd2dc1
> model: Westmere
> memory: 6G
> nr_hdd_partitions: 1
> hdd_partitions:
> swap_partitions:
> rootfs_partition:
> netconsole_port: 6667
> perf-profile:
> freq: 800
> will-it-scale:
> test: malloc1
> testbox: wsm
> tbox_group: wsm
> kconfig: x86_64-rhel
> enqueue_time: 2015-02-12 07:25:08.935926504 +08:00
> head_commit: 7496ad672d0ed7177fefec9c70ece8a70bbd2dc1
> base_commit: bfa76d49576599a4b9f9b7a71f23d73d6dcff735
> branch: next/master
> kernel: "/kernel/x86_64-rhel/7496ad672d0ed7177fefec9c70ece8a70bbd2dc1/vmlinuz-3.19.0-next-20150212-g7496ad6"
> user: lkp
> queue: cyclic
> rootfs: debian-x86_64-2015-02-07.cgz
> result_root: "/result/wsm/will-it-scale/performance-malloc1/debian-x86_64-2015-02-07.cgz/x86_64-rhel/7496ad672d0ed7177fefec9c70ece8a70bbd2dc1/0"
> job_file: "/lkp/scheduled/wsm/cyclic_will-it-scale-performance-malloc1-x86_64-rhel-HEAD-7496ad672d0ed7177fefec9c70ece8a70bbd2dc1-0-20150212-31440-ix9pa1.yaml"
> dequeue_time: 2015-02-12 16:35:45.778268771 +08:00
> nr_cpu: "$(nproc)"
> job_state: finished
> loadavg: 8.37 4.91 2.02 1/158 5629
> start_time: '1423730169'
> end_time: '1423730474'
> version: "/lkp/lkp/.src-20150212-162408"

> ./runtest.py malloc1 32 both 1 6 9 12



--
Michal Hocko
SUSE Labs