Re: [sched] 23f0d2093c: -12.6% regression on sparse file copy

From: Joonsoo Kim
Date: Sun Jan 05 2014 - 19:30:49 EST


On Sun, Jan 05, 2014 at 05:04:56PM +0800, fengguang.wu@xxxxxxxxx wrote:
> Hi Joonsoo,
>
> We noticed the below changes for commit 23f0d2093c ("sched: Factor out
> code to should_we_balance()") in test vm-scalability/300s-lru-file-readtwice

Hello, Fengguang.

There was a mistake in this patch; a fix has already been merged into
mainline.

Could you test again with the commit (b0cff9d sched: Fix load balancing
performance regression in should_we_balance())?

Thanks.

>
>     95a79b805b935f4  23f0d2093c789e612185180c4
>     ---------------  -------------------------
> ==>       4.45 ~ 5%  +1777.7%       83.60 ~ 5%  vm-scalability.stddev
> ==>   14966511 ~ 0%    -12.6%    13084545 ~ 2%  vm-scalability.throughput
>             38 ~ 9%   +406.3%         193 ~ 7%  proc-vmstat.kswapd_low_wmark_hit_quickly
>         610823 ~ 0%    -41.4%      357990 ~ 0%  softirqs.SCHED
>      5.424e+08 ~ 0%    -38.5%   3.338e+08 ~ 6%  proc-vmstat.pgdeactivate
>       4.68e+08 ~ 0%    -37.5%   2.924e+08 ~ 6%  proc-vmstat.pgrefill_normal
>      5.549e+08 ~ 0%    -37.1%   3.491e+08 ~ 6%  proc-vmstat.pgactivate
>       14938509 ~ 1%    +27.0%    18974176 ~ 1%  vmstat.memory.free
>         978771 ~ 1%    +23.9%     1212704 ~ 3%  numa-vmstat.node2.nr_free_pages
>        3747434 ~ 0%    +21.7%     4560196 ~ 2%  proc-vmstat.nr_free_pages
> ==>  1.353e+08 ~ 0%    +18.8%   1.607e+08 ~ 0%  proc-vmstat.numa_foreign
>      1.353e+08 ~ 0%    +18.8%   1.607e+08 ~ 0%  proc-vmstat.numa_miss
>      1.353e+08 ~ 0%    +18.8%   1.607e+08 ~ 0%  proc-vmstat.numa_other
>        3936842 ~ 1%    +22.2%     4812045 ~ 4%  numa-meminfo.node2.MemFree
>       21803812 ~ 0%    +17.7%    25661536 ~ 4%  numa-vmstat.node3.numa_foreign
>       73701524 ~ 0%    +15.0%    84769542 ~ 0%  proc-vmstat.pgscan_direct_dma32
>       73700683 ~ 0%    +15.0%    84768687 ~ 0%  proc-vmstat.pgsteal_direct_dma32
>      3.101e+08 ~ 0%    +11.2%   3.448e+08 ~ 0%  proc-vmstat.pgsteal_direct_normal
>      3.103e+08 ~ 0%    +11.2%   3.449e+08 ~ 0%  proc-vmstat.pgscan_direct_normal
>       45613907 ~ 0%    +12.6%    51342974 ~ 3%  numa-vmstat.node0.numa_other
>         795639 ~ 0%    -48.6%      409113 ~13%  time.voluntary_context_switches
>            375 ~ 0%     +6.1%         398 ~ 0%  time.elapsed_time
>           9427 ~ 0%     -5.8%        8880 ~ 0%  time.percent_of_cpu_this_job_got
>
> The test case basically does
>
>     for i in `seq 1 $nr_cpu`
>     do
>         create_sparse_file huge-$i
>         dd if=huge-$i of=/dev/null &
>         dd if=huge-$i of=/dev/null &
>     done
>
> where nr_cpu=120 (test box is a 4-socket ivybridge system).
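
For anyone who wants to reproduce the access pattern on a smaller box, the
loop above can be approximated roughly as below. This is only a sketch under
assumptions: create_sparse_file is not shown in the report, so GNU
`truncate -s` stands in for it here, and the file size, the `bs=1M` block
size, the temporary directory and the default of 2 files are hypothetical
values chosen to keep the run short (the report used nr_cpu=120).

```shell
#!/bin/sh
# Scaled-down sketch of the "read each sparse file twice" loop.
# Assumptions (not from the report): truncate replaces create_sparse_file,
# file_size and bs=1M are arbitrary, files live in a temporary directory.
nr_cpu=${nr_cpu:-2}            # the report used 120
file_size=${file_size:-16M}    # hypothetical; the report gives no size
workdir=$(mktemp -d)

for i in $(seq 1 "$nr_cpu")
do
    # truncate -s sets the file length without writing data blocks,
    # which yields a sparse file on filesystems that support holes
    truncate -s "$file_size" "$workdir/huge-$i"
    # two concurrent readers per file, as in the original loop
    dd if="$workdir/huge-$i" of=/dev/null bs=1M 2>/dev/null &
    dd if="$workdir/huge-$i" of=/dev/null bs=1M 2>/dev/null &
done

wait                           # block until every background dd has exited

# count the files that were created, then clean up
set -- "$workdir"/huge-*
nr_files=$#
echo "read $nr_files sparse files twice"
rm -rf "$workdir"
```

Setting nr_cpu and file_size in the environment scales the run up; this
sketch does not measure throughput or stddev the way vm-scalability does.
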
>
> The change looks stable, each point below is a sample run:
>
> vm-scalability.stddev
>
> [ASCII plot, y-axis 0 to 120: samples marked '*' lie between about 60 and 100; samples marked 'O' lie between 0 and 20]
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/