[PATCH] sched/fair: Improve update_sd_pick_busiest for spare capacity case

From: Vincent Guittot
Date: Fri Dec 20 2019 - 06:05:00 EST


Similarly to calculate_imbalance() and find_busiest_group(), using the
number of idle CPUs when there is only 1 CPU in the group is not efficient
because we can't distinguish between a CPU running 1 task and a CPU
running dozens of small tasks that compete for the same CPU but are not
enough to overload it. More generally speaking, we should use the number
of running tasks when groups have the same number of idle CPUs instead
of blindly selecting the first one.

When the groups have spare capacity and the same number of idle CPUs, we
compare the number of running tasks to select the busiest group.

Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
---
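As an illustration only, the tie-break described above can be sketched in
user space. This is a minimal model, not the kernel code: struct
sg_stats_model and both helper names are hypothetical, only the two fields
the comparison reads are kept, and the sketch assumes that falling through
the group_has_spare case means "pick this group", whereas the real function
performs further checks after the switch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the statistics the patch compares. */
struct sg_stats_model {
	unsigned int idle_cpus;      /* idle CPUs in the group */
	unsigned int sum_nr_running; /* tasks running in the group */
};

/*
 * Rule before the patch: a candidate replaces the current busiest group
 * only if it has strictly fewer idle CPUs, so a tie keeps the first one.
 */
static bool old_has_spare_pick(const struct sg_stats_model *sgs,
			       const struct sg_stats_model *busiest)
{
	return sgs->idle_cpus < busiest->idle_cpus;
}

/*
 * Rule after the patch: on a tie in idle CPUs, the candidate is picked
 * only if it runs strictly more tasks than the current busiest group.
 */
static bool new_has_spare_pick(const struct sg_stats_model *sgs,
			       const struct sg_stats_model *busiest)
{
	if (sgs->idle_cpus > busiest->idle_cpus)
		return false;
	if (sgs->idle_cpus == busiest->idle_cpus &&
	    sgs->sum_nr_running <= busiest->sum_nr_running)
		return false;

	/* Assumption of this sketch: no later check rejects the group. */
	return true;
}
```

With two single-CPU groups that both have 0 idle CPUs, one running 1 task
and the other running 12, old_has_spare_pick() keeps whichever group was
seen first, while new_has_spare_pick() selects the group running 12 tasks
regardless of the order in which the groups are visited.
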
kernel/sched/fair.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 280d54ccb4be..808bba8c9f6d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8162,14 +8162,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,

 	case group_has_spare:
 		/*
-		 * Select not overloaded group with lowest number of
-		 * idle cpus. We could also compare the spare capacity
-		 * which is more stable but it can end up that the
-		 * group has less spare capacity but finally more idle
+		 * Select not overloaded group with lowest number of idle cpus
+		 * and highest number of running tasks. We could also compare
+		 * the spare capacity which is more stable but it can end up
+		 * that the group has less spare capacity but finally more idle
 		 * CPUs which means less opportunity to pull tasks.
 		 */
-		if (sgs->idle_cpus >= busiest->idle_cpus)
+		if (sgs->idle_cpus > busiest->idle_cpus)
 			return false;
+		else if ((sgs->idle_cpus == busiest->idle_cpus) &&
+			 (sgs->sum_nr_running <= busiest->sum_nr_running))
+			return false;
+
 		break;
 	}

--
2.7.4