Re: hackbench vs select_idle_sibling; was: [tip:sched/core] sched/fair, cpumask: Export for_each_cpu_wrap()

From: Matt Fleming
Date: Fri May 19 2017 - 11:00:43 EST


On Wed, 17 May, at 12:53:50PM, Peter Zijlstra wrote:
>
> Please test..

Results are still coming in but things do look better with your patch
applied.

It does look like there's a regression when running hackbench in
process mode and when the CPUs are not fully utilised, e.g. check this
out:

hackbench-process-pipes
                                4.4.68                      4.4.68                      4.4.68                      4.4.68
                            sles12-sp3  select-idle-cpu-aggressive       for-each-cpu-wrap-fix        latest-hackbench-fix
Amean 1               0.8853 (  0.00%)            1.2160 (-37.35%)            1.0350 (-16.91%)            1.1853 (-33.89%)

This machine has 80 CPUs and that's a 40 process workload.

Here's the key:

select-idle-cpu-aggressive: 4c77b18cf8b7 ("sched/fair: Make select_idle_cpu() more aggressive")
for-each-cpu-wrap-fix: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()")
latest-hackbench-fix: this patch
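
For anyone reading the percentages in parentheses: they appear to be each kernel's Amean relative to the sles12-sp3 baseline, with a negative value meaning slower. A minimal sketch of that arithmetic follows, assuming the convention is (baseline - value) / baseline; this is inferred from the numbers in the process-pipes row and not taken from the reporting tool's source, and the function name relative_change is made up for illustration.

```python
def relative_change(baseline: float, value: float) -> float:
    """Percentage change of `value` against `baseline`.

    Assumed convention: negative means slower, since hackbench reports
    elapsed time and lower is better.
    """
    return (baseline - value) / baseline * 100.0


# Amean values copied from the hackbench-process-pipes row.
baseline = 0.8853  # sles12-sp3
results = {
    "select-idle-cpu-aggressive": 1.2160,
    "for-each-cpu-wrap-fix": 1.0350,
    "latest-hackbench-fix": 1.1853,
}

for name, amean in results.items():
    print(f"{name:28s} {amean:.4f} ({relative_change(baseline, amean):6.2f}%)")
```

Run against the process-pipes numbers, this gives the same -37.35%, -16.91% and -33.89% shown in that row.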

But those results definitely look to be an exception. Here's the same
machine running the same number of tasks but with pthreads:

hackbench-thread-pipes
                                4.4.68                      4.4.68                      4.4.68                      4.4.68
                            sles12-sp3  select-idle-cpu-aggressive       for-each-cpu-wrap-fix        latest-hackbench-fix
Amean 1               0.7427 (  0.00%)            0.9760 (-31.42%)            1.1907 (-60.32%)            0.7643 ( -2.92%)

Nice win.