Re: [PATCH v2 00/13] sched: Clean-ups and asymmetric cpu capacity support

From: Vincent Guittot
Date: Wed Jul 13 2016 - 08:06:57 EST


Hi Morten,

On 22 June 2016 at 19:03, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> Hi,
>
> The scheduler is currently not doing much to help performance on systems with
> asymmetric compute capacities (read ARM big.LITTLE). This series improves the
> situation with a few tweaks mainly to the task wake-up path that considers
> compute capacity at wake-up and not just whether a cpu is idle for these
> systems. This gives us consistent, and potentially higher, throughput in
> partially utilized scenarios. SMP behaviour and performance should be
> unaffected.
>
> Test 0:
> for i in `seq 1 10`; \
> do sysbench --test=cpu --max-time=3 --num-threads=1 run; \
> done \
> | awk '{if ($4=="events:") {print $5; sum +=$5; runs +=1}} \
> END {print "Average events: " sum/runs}'
>
> Target: ARM TC2 (2xA15+3xA7)
>
> (Higher is better)
> tip: Average events: 146.9
> patch: Average events: 217.9
>
> Test 1:
> perf stat --null --repeat 10 -- \
> perf bench sched messaging -g 50 -l 5000
>
> Target: Intel IVB-EP (2*10*2)
>
> tip: 4.861970420 seconds time elapsed ( +- 1.39% )
> patch: 4.886204224 seconds time elapsed ( +- 0.75% )
>
> Target: ARM TC2 A7-only (3xA7) (-l 1000)
>
> tip: 61.485682596 seconds time elapsed ( +- 0.07% )
> patch: 62.667950130 seconds time elapsed ( +- 0.36% )
>
> More analysis:
>
> Statistics from mixed periodic task workload (rt-app) containing both
> big and little tasks, single run on ARM TC2:
>
> tu   = Task utilization big/little
> pcpu = Previous cpu big/little
> tcpu = This (waker) cpu big/little
> dl   = New cpu is little
> db   = New cpu is big
> sis  = New cpu chosen by select_idle_sibling()
> figc = New cpu chosen by find_idlest_*()
> ww   = wake_wide(task) count for figc wakeups
> bw   = sd_flag & SD_BALANCE_WAKE (non-fork/exec wake)
>        for figc wakeups
>
> case tu pcpu tcpu     dl    db   sis  figc    ww    bw
> 1    l  l    l       122    68    28   162   161   161
> 2    l  l    b        11     4     0    15    15    15
> 3    l  b    l         0   252     8   244   244   244
> 4    l  b    b        36  1928   711  1253  1016  1016
> 5    b  l    l         5    19     0    24    22    24
> 6    b  l    b         5     1     0     6     0     6
> 7    b  b    l         0    31     0    31    31    31
> 8    b  b    b         1   194   109    86    59    59
> ------------------------------------------------------
>                        180  2497   856  1821

I'm not sure how to interpret all these statistics.

>
> Cases 1-4 + 8 are fine to be served by select_idle_sibling() as both
> this_cpu and prev_cpu are suitable cpus for the task. However, as the
> figc column reveals, those cases are often served by find_idlest_*()
> anyway due to wake_wide() sending the wakeup that way when
> SD_BALANCE_WAKE is set on the sched_domains.
>
> Pulling in the wakee_flip patch (dropped in v2) from v1 shifts a
> significant share of the wakeups to sis from figc:
>
> case tu pcpu tcpu     dl    db   sis  figc    ww    bw
> 1    l  l    l       537     8   537     8     6     6
> 2    l  l    b        49    11    32    28    28    28
> 3    l  b    l         4   323   322     5     5     5
> 4    l  b    b         1  1910  1209   702   458   456
> 5    b  l    l         0     5     0     5     1     5
> 6    b  l    b         0     0     0     0     0     0
> 7    b  b    l         0    32     0    32     2    32
> 8    b  b    b         0   198   168    30    13    13
> ------------------------------------------------------
>                        591  2487  2268   810
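
One possible way to read the two tables, offered only as a rough sketch:
compare the share of wakeups placed by select_idle_sibling() using the sis
and figc column totals. The dictionary labels and the sis_share() helper are
made up for illustration; only the four totals are taken from the tables
above.

```python
# Column totals (sis, figc) copied from the two tables quoted above.
# The keys are illustrative labels, not names used in the patch set.
totals = {
    "first table": {"sis": 856, "figc": 1821},
    "second table (with wakee_flip patch)": {"sis": 2268, "figc": 810},
}


def sis_share(counts):
    """Fraction of wakeups placed by select_idle_sibling()."""
    return counts["sis"] / (counts["sis"] + counts["figc"])


shares = {name: sis_share(counts) for name, counts in totals.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of wakeups served by sis")
```

With these totals the sis share goes from roughly a third of the wakeups to
roughly three quarters, which matches the "significant share" described in
the cover letter.
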
>
> Notes:
>
> Active migration of tasks away from small capacity cpus isn't addressed
> in this set although it is necessary for consistent throughput in other
> scenarios on asymmetric cpu capacity systems.
>
> The infrastructure to enable capacity awareness for arm64 is not provided here
> but will be based on Juri's DT bindings patch set [1]. A combined preview
> branch is available [2].
>
> [1] https://lkml.org/lkml/2016/6/15/291
> [2] git://linux-arm.org/linux-power.git capacity_awareness_v2_arm64_v1
>
> Patch 1-3: Generic fixes and clean-ups.
> Patch 4-11: Improve capacity awareness.
> Patch 12-13: Arch features for arm to enable asymmetric capacity support.
>
> v2:
>
> - Dropped patch ignoring wakee_flips for pid=0 for now as we can not
> distinguish cpu time processing irqs from idle time.
>
> - Dropped disabling WAKE_AFFINE as suggested by Vincent to allow more
> scenarios to use fast-path (select_idle_sibling()). Asymmetric wake
> conditions adjusted accordingly.
>
> - Changed use of new SD_ASYM_CPUCAPACITY slightly. Now enables
> SD_BALANCE_WAKE.
>
> - Minor clean-ups and rebased to more recent tip/sched/core.
>
> v1: https://lkml.org/lkml/2014/5/23/621
>
> Dietmar Eggemann (1):
>   sched: Store maximum per-cpu capacity in root domain
>
> Morten Rasmussen (12):
>   sched: Fix power to capacity renaming in comment
>   sched/fair: Consistent use of prev_cpu in wakeup path
>   sched/fair: Optimize find_idlest_cpu() when there is no choice
>   sched: Introduce SD_ASYM_CPUCAPACITY sched_domain topology flag
>   sched: Enable SD_BALANCE_WAKE for asymmetric capacity systems
>   sched/fair: Let asymmetric cpu configurations balance at wake-up
>   sched/fair: Compute task/cpu utilization at wake-up more correctly
>   sched/fair: Consider spare capacity in find_idlest_group()
>   sched: Add per-cpu max capacity to sched_group_capacity
>   sched/fair: Avoid pulling tasks from non-overloaded higher capacity
>     groups
>   arm: Set SD_ASYM_CPUCAPACITY for big.LITTLE platforms
>   arm: Update arch_scale_cpu_capacity() to reflect change to define
>
>  arch/arm/include/asm/topology.h |   5 +
>  arch/arm/kernel/topology.c      |  25 ++++-
>  include/linux/sched.h           |   3 +-
>  kernel/sched/core.c             |  21 +++-
>  kernel/sched/fair.c             | 212 +++++++++++++++++++++++++++++++++++-----
>  kernel/sched/sched.h            |   5 +-
>  6 files changed, 241 insertions(+), 30 deletions(-)
>
> --
> 1.9.1
>