[tip:sched/core] sched/fair: Set rq->rd->overload when misfit
From: tip-bot for Valentin Schneider
Date: Mon Sep 10 2018 - 06:18:03 EST
Commit-ID: 757ffdd705ee942fc8150b17942d968601d2a15b
Gitweb: https://git.kernel.org/tip/757ffdd705ee942fc8150b17942d968601d2a15b
Author: Valentin Schneider <valentin.schneider@xxxxxxx>
AuthorDate: Wed, 4 Jul 2018 11:17:47 +0100
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Mon, 10 Sep 2018 11:05:53 +0200
sched/fair: Set rq->rd->overload when misfit
Idle balance is a great opportunity to pull a misfit task. However,
there are scenarios where misfit tasks are present but idle balance is
prevented by the overload flag.
A good example of this is a workload of n identical tasks. Let's suppose
we have a 2+2 Arm big.LITTLE system. We then spawn 4 fairly
CPU-intensive tasks - for the sake of simplicity let's say they are just
CPU hogs, even when running on big CPUs.
They are identical tasks, so on an SMP system they should all end at
(roughly) the same time. However, in our case the LITTLE CPUs are less
performing than the big CPUs, so tasks running on the LITTLEs will have
a longer completion time.
This means that the big CPUs will complete their work earlier, at which
point they should pull the tasks from the LITTLEs. What we want to
happen is summarized as follows:
a,b,c,d are our CPU-hogging tasks _ signifies idling
LITTLE_0 | a a a a _ _
LITTLE_1 | b b b b _ _
---------|-------------
big_0 | c c c c a a
big_1 | d d d d b b
^
^
Tasks end on the big CPUs, idle balance happens
and the misfit tasks are pulled straight away
This however won't happen, because currently the overload flag is only
set when there is any CPU that has more than one runnable task - which
may very well not be the case here if our CPU-hogging workload is all
there is to run.
As such, this commit sets the overload flag in update_sg_lb_stats when
a group is flagged as having a misfit task.
Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx>
Signed-off-by: Morten Rasmussen <morten.rasmussen@xxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: dietmar.eggemann@xxxxxxx
Cc: gaku.inami.xh@xxxxxxxxxxx
Cc: vincent.guittot@xxxxxxxxxx
Link: http://lkml.kernel.org/r/1530699470-29808-10-git-send-email-morten.rasmussen@xxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/fair.c | 6 ++++--
kernel/sched/sched.h | 6 +++++-
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d9c4e97bfebd..8b228c5b3eb4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7793,7 +7793,7 @@ static bool update_nohz_stats(struct rq *rq, bool force)
* @load_idx: Load index of sched_domain of this_cpu for load calc.
* @local_group: Does group contain this_cpu.
* @sgs: variable to hold the statistics for this group.
- * @overload: Indicate more than one runnable task for any CPU.
+ * @overload: Indicate pullable load (e.g. >1 runnable task).
*/
static inline void update_sg_lb_stats(struct lb_env *env,
struct sched_group *group, int load_idx,
@@ -7837,8 +7837,10 @@ static inline void update_sg_lb_stats(struct lb_env *env,
sgs->idle_cpus++;
if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
- sgs->group_misfit_task_load < rq->misfit_task_load)
+ sgs->group_misfit_task_load < rq->misfit_task_load) {
sgs->group_misfit_task_load = rq->misfit_task_load;
+ *overload = 1;
+ }
}
/* Adjust by relative CPU capacity of the group */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 938063639793..85b3a2bf6c2b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -715,7 +715,11 @@ struct root_domain {
cpumask_var_t span;
cpumask_var_t online;
- /* Indicate more than one runnable task for any CPU */
+ /*
+ * Indicate pullable load on at least one CPU, e.g:
+ * - More than one runnable task
+ * - Running task is misfit
+ */
int overload;
/*