[tip: sched/core] sched/fair: Use load instead of runnable load in load_balance()

From: tip-bot2 for Vincent Guittot
Date: Mon Oct 21 2019 - 05:13:53 EST


The following commit has been merged into the sched/core branch of tip:

Commit-ID: b0fb1eb4f04ae4768231b9731efb1134e22053a4
Gitweb: https://git.kernel.org/tip/b0fb1eb4f04ae4768231b9731efb1134e22053a4
Author: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
AuthorDate: Fri, 18 Oct 2019 15:26:33 +02:00
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitterDate: Mon, 21 Oct 2019 09:40:54 +02:00

sched/fair: Use load instead of runnable load in load_balance()

'runnable load' was originally introduced to handle the case where
blocked load biased the load balance decision: an underutilized group
with a huge blocked load could be selected while other groups were
overloaded.

The load is now only used when groups are overloaded. In this case,
it's worth being conservative and taking into account the sleeping
tasks that might wake up on the CPU.

Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Ben Segall <bsegall@xxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Mike Galbraith <efault@xxxxxx>
Cc: Morten.Rasmussen@xxxxxxx
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: hdanton@xxxxxxxx
Cc: parth@xxxxxxxxxxxxx
Cc: pauld@xxxxxxxxxx
Cc: quentin.perret@xxxxxxx
Cc: riel@xxxxxxxxxxx
Cc: srikar@xxxxxxxxxxxxxxxxxx
Cc: valentin.schneider@xxxxxxx
Link: https://lkml.kernel.org/r/1571405198-27570-7-git-send-email-vincent.guittot@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/fair.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
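
Note: the find_busiest_queue() hunk below mentions looking for
max(load_i / capacity_i) using crosswise multiplication instead of a
division. The following is only a minimal userspace sketch of that idea,
not the kernel code. struct cpu_sample and busiest_index() are
hypothetical names made up for illustration, and the load and capacity
fields merely stand in for what cpu_load() and capacity_of() return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU sample; this is not a kernel structure. */
struct cpu_sample {
	unsigned long load;	/* stand-in for cpu_load(rq) */
	unsigned long capacity;	/* stand-in for capacity_of(cpu) */
};

/*
 * Return the index of the entry with the largest load / capacity ratio,
 * or -1 if no entry has a non-zero load (including n == 0).
 *
 * load_i / capacity_i > load_b / capacity_b is evaluated as
 * load_i * capacity_b > load_b * capacity_i, so no division is needed.
 * The products are assumed not to overflow unsigned long.
 */
static int busiest_index(const struct cpu_sample *cpus, size_t n)
{
	unsigned long busiest_load = 0, busiest_capacity = 1;
	int busiest = -1;
	size_t i;

	assert(n == 0 || cpus != NULL);

	for (i = 0; i < n; i++) {
		if (cpus[i].load * busiest_capacity >
		    busiest_load * cpus[i].capacity) {
			busiest_load = cpus[i].load;
			busiest_capacity = cpus[i].capacity;
			busiest = (int)i;
		}
	}

	return busiest;
}
```

With samples {600, 1024} and {400, 512}, the second entry is picked even
though its absolute load is lower, because 400 * 1024 > 600 * 512.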

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4e7396c..e6a3db0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5375,6 +5375,11 @@ static unsigned long cpu_runnable_load(struct rq *rq)
return cfs_rq_runnable_load_avg(&rq->cfs);
}

+static unsigned long cpu_load(struct rq *rq)
+{
+ return cfs_rq_load_avg(&rq->cfs);
+}
+
static unsigned long capacity_of(int cpu)
{
return cpu_rq(cpu)->cpu_capacity;
@@ -8049,7 +8054,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
if ((env->flags & LBF_NOHZ_STATS) && update_nohz_stats(rq, false))
env->flags |= LBF_NOHZ_AGAIN;

- sgs->group_load += cpu_runnable_load(rq);
+ sgs->group_load += cpu_load(rq);
sgs->group_util += cpu_util(i);
sgs->sum_h_nr_running += rq->cfs.h_nr_running;

@@ -8507,7 +8512,7 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
init_sd_lb_stats(&sds);

/*
- * Compute the various statistics relavent for load balancing at
+ * Compute the various statistics relevant for load balancing at
* this level.
*/
update_sd_lb_stats(env, &sds);
@@ -8667,11 +8672,10 @@ static struct rq *find_busiest_queue(struct lb_env *env,
switch (env->migration_type) {
case migrate_load:
/*
- * When comparing with load imbalance, use
- * cpu_runnable_load() which is not scaled with the CPU
- * capacity.
+ * When comparing with load imbalance, use cpu_load()
+ * which is not scaled with the CPU capacity.
*/
- load = cpu_runnable_load(rq);
+ load = cpu_load(rq);

if (nr_running == 1 && load > env->imbalance &&
!check_cpu_capacity(rq, env->sd))
@@ -8679,10 +8683,10 @@ static struct rq *find_busiest_queue(struct lb_env *env,

/*
* For the load comparisons with the other CPUs,
- * consider the cpu_runnable_load() scaled with the CPU
- * capacity, so that the load can be moved away from
- * the CPU that is potentially running at a lower
- * capacity.
+ * consider the cpu_load() scaled with the CPU
+ * capacity, so that the load can be moved away
+ * from the CPU that is potentially running at a
+ * lower capacity.
*
* Thus we're looking for max(load_i / capacity_i),
* crosswise multiplication to rid ourselves of the