[tip: sched/core] sched/fair : Improve update_sd_pick_busiest for spare capacity case

From: tip-bot2 for Vincent Guittot
Date: Fri Jan 17 2020 - 05:09:01 EST


The following commit has been merged into the sched/core branch of tip:

Commit-ID: 5f68eb19b5716f8cf3ccfa833cffd1522813b0e8
Gitweb: https://git.kernel.org/tip/5f68eb19b5716f8cf3ccfa833cffd1522813b0e8
Author: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
AuthorDate: Fri, 20 Dec 2019 12:04:53 +01:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Fri, 17 Jan 2020 10:19:19 +01:00

sched/fair : Improve update_sd_pick_busiest for spare capacity case

Similarly to calculate_imbalance() and find_busiest_group(), using the
number of idle CPUs when there is only 1 CPU in the group is not efficient
because we can't distinguish between a CPU running 1 task and a CPU
running dozens of small tasks competing for the same CPU but not enough
to overload it. More generally speaking, we should use the number of
running tasks when there is the same number of idle CPUs in a group instead
of blindly selecting the 1st one.

When the groups have spare capacity and the same number of idle CPUs, we
compare the number of running tasks to select the busiest group.

Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: https://lkml.kernel.org/r/1576839893-26930-1-git-send-email-vincent.guittot@xxxxxxxxxx
---
kernel/sched/fair.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2d170b5..35c1057 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8181,14 +8181,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,

 	case group_has_spare:
 		/*
-		 * Select not overloaded group with lowest number of
-		 * idle cpus. We could also compare the spare capacity
-		 * which is more stable but it can end up that the
-		 * group has less spare capacity but finally more idle
+		 * Select not overloaded group with lowest number of idle cpus
+		 * and highest number of running tasks. We could also compare
+		 * the spare capacity which is more stable but it can end up
+		 * that the group has less spare capacity but finally more idle
 		 * CPUs which means less opportunity to pull tasks.
 		 */
-		if (sgs->idle_cpus >= busiest->idle_cpus)
+		if (sgs->idle_cpus > busiest->idle_cpus)
 			return false;
+		else if ((sgs->idle_cpus == busiest->idle_cpus) &&
+			 (sgs->sum_nr_running <= busiest->sum_nr_running))
+			return false;
+
 		break;
 	}
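
For readers who want to try the new tie-break outside the kernel, the
group_has_spare comparison can be sketched as a standalone predicate. This
is only an illustration under stated assumptions: struct sg_stats and
spare_group_is_busier() are hypothetical names that mirror just the two
fields the patch reads, and any checks update_sd_pick_busiest() performs
outside this case are not modelled.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the two statistics the patch reads. */
struct sg_stats {
	unsigned int idle_cpus;		/* number of idle CPUs in the group */
	unsigned int sum_nr_running;	/* number of running tasks in the group */
};

/*
 * Return true when the candidate group (sgs) should replace the current
 * busiest group, assuming both are classified as group_has_spare.
 */
static bool spare_group_is_busier(const struct sg_stats *sgs,
				  const struct sg_stats *busiest)
{
	/* More idle CPUs means the candidate is less busy: keep the current pick. */
	if (sgs->idle_cpus > busiest->idle_cpus)
		return false;

	/* Same idle count: the candidate wins only with strictly more tasks. */
	if (sgs->idle_cpus == busiest->idle_cpus &&
	    sgs->sum_nr_running <= busiest->sum_nr_running)
		return false;

	return true;
}
```

With the old ">=" test, a candidate with the same number of idle CPUs was
always rejected, so the first group examined stayed the busiest. In this
sketch, such a candidate is selected when it runs strictly more tasks.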