[PATCH v2 0/4] Improve newidle lb cost tracking and early abort

From: Vincent Guittot
Date: Fri Oct 15 2021 - 08:50:39 EST


This patchset updates newidle lb cost tracking and early abort:

The time spent running update_blocked_averages() is now accounted in the 1st
sched_domain level. This time can be significant and can push the cost of
newidle lb above the avg_idle time.
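
As a rough illustration of this accounting, here is a user-space sketch in
plain C. It is a model under stated assumptions and not the kernel code:
struct domain_model, account_newidle_cost() and the lb_time_ns field are
hypothetical names, and the durations are passed in rather than measured
with sched_clock_cpu().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model, not the kernel's struct sched_domain. */
struct domain_model {
	uint64_t lb_time_ns;          /* time spent balancing this level */
	uint64_t max_newidle_lb_cost; /* tracked worst-case cost, in ns */
};

/*
 * Charge the blocked-load update time to the first level only, then add
 * each level's own balancing time. The tracked maximum of a level is
 * raised when its measured cost exceeds it.
 * Returns the total cost accumulated over all levels.
 */
static uint64_t account_newidle_cost(struct domain_model *sd, size_t nr,
				     uint64_t blocked_update_ns)
{
	uint64_t pending = blocked_update_ns;
	uint64_t curr_cost = 0;

	for (size_t i = 0; i < nr; i++) {
		uint64_t domain_cost = pending + sd[i].lb_time_ns;

		if (domain_cost > sd[i].max_newidle_lb_cost)
			sd[i].max_newidle_lb_cost = domain_cost;

		curr_cost += domain_cost;
		pending = 0; /* only the first level pays for the update */
	}

	return curr_cost;
}
```

With a 20us update and two levels taking 10us and 40us, the first level is
charged 30us and the total is 70us.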

The decay of max_newidle_lb_cost is modified to start only when the field
has not been updated for a while. A recent update is not decayed
immediately but only after a while.
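
A minimal user-space sketch of this delayed decay follows. The names
(struct cost_tracker, update_max_cost(), DECAY_DELAY_TICKS), the length of
the delay and the 253/256 decay factor are assumptions chosen for
illustration; the patch itself may use different values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed length of "a while", in arbitrary clock ticks. */
#define DECAY_DELAY_TICKS 1000u

/* Hypothetical tracker for a worst-case cost and its last change. */
struct cost_tracker {
	uint64_t max_cost;    /* current tracked maximum */
	uint64_t last_update; /* tick of the last raise or decay */
};

/*
 * Record a new cost sample taken at tick 'now'.
 * A larger sample raises the maximum and restarts the delay.
 * Otherwise the maximum decays (by an assumed factor of 253/256) only
 * once it has been left untouched for more than DECAY_DELAY_TICKS.
 * Returns true when a decay was applied.
 */
static bool update_max_cost(struct cost_tracker *t, uint64_t cost,
			    uint64_t now)
{
	if (cost > t->max_cost) {
		t->max_cost = cost;
		t->last_update = now;
	} else if (now - t->last_update > DECAY_DELAY_TICKS) {
		t->max_cost = t->max_cost * 253 / 256;
		t->last_update = now;
		return true;
	}

	return false;
}
```

A smaller sample that arrives shortly after a new maximum leaves the
maximum unchanged; the same sample arriving after the delay has expired
triggers one decay step.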

The condition of an avg_idle lower than sysctl_sched_migration_cost has
been removed, as the 500us value is quite large and prevents opportunities
to pull tasks onto the newly idle CPU, at least for the 1st domain levels.
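
To make the difference concrete, the sketch below contrasts the removed
global gate with a per-level cost check. It is a simplified user-space
model: old_skip_all() and level_too_costly() are hypothetical helpers,
MIGRATION_COST_NS stands in for the 500us value mentioned above, and the
per-level comparison is written as an assumption about how avg_idle is
weighed against the tracked cost.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the 500us sysctl_sched_migration_cost value. */
#define MIGRATION_COST_NS 500000ull

/* Removed gate: skip newidle balancing entirely for short idle periods. */
static bool old_skip_all(uint64_t avg_idle_ns)
{
	return avg_idle_ns < MIGRATION_COST_NS;
}

/*
 * Per-level check (assumed form): stop walking the domains when the
 * expected idle time cannot cover the cost already spent plus the
 * tracked worst-case cost of the next level.
 */
static bool level_too_costly(uint64_t avg_idle_ns, uint64_t curr_cost_ns,
			     uint64_t max_cost_ns)
{
	return avg_idle_ns < curr_cost_ns + max_cost_ns;
}
```

With an avg_idle of 300us, the old gate skips every level, whereas the
per-level check still allows a first level whose tracked cost is 50us.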

Monitoring sd->max_newidle_lb_cost on cpu0 of an Arm64 system
THX2 (2 nodes * 28 cores * 4 cpus) during the benchmarks gives the
following results:
        min    avg    max
SMT:    1us   33us  273us - this one includes the update of blocked load
MC:     7us   49us  398us
NUMA:  10us   45us  158us


Some results for hackbench -l $LOOPS -g $group :
group   tip/sched/core     + this patchset
1       15.189(+/- 2%)     14.987(+/- 2%)    +1%
4        4.336(+/- 3%)      4.322(+/- 5%)    +0%
16       3.654(+/- 1%)      2.922(+/- 3%)   +20%
32       3.209(+/- 1%)      2.919(+/- 3%)    +9%
64       2.965(+/- 1%)      2.826(+/- 1%)    +4%
128      2.954(+/- 1%)      2.993(+/- 8%)    -1%
256      2.951(+/- 1%)      2.894(+/- 1%)    +2%

tbench and reaim have not shown any difference

Changes since v1:
- account the time spent in update_blocked_averages() in the 1st domain

- reduce the number of calls to sched_clock_cpu()

- change the way max_newidle_lb_cost is decayed. Peter suggested using an
IIR filter, but keeping track of the current max value gave the best results

- removed the condition (this_rq->avg_idle < sysctl_sched_migration_cost)
as suggested by Peter

Vincent Guittot (4):
sched/fair: Account update_blocked_averages in newidle_balance cost
sched/fair: Skip update_blocked_averages if we are defering load
balance
sched/fair: Wait before decaying max_newidle_lb_cost
sched/fair: Remove sysctl_sched_migration_cost condition

include/linux/sched/topology.h | 2 +-
kernel/sched/fair.c | 29 ++++++++++++++++++-----------
kernel/sched/topology.c | 2 +-
3 files changed, 20 insertions(+), 13 deletions(-)

--
2.17.1