[PATCH 3/4 v2] sched/fair: minimize concurrent LBs between domain level

From: Vincent Guittot
Date: Mon Sep 21 2020 - 03:24:46 EST


sched domains tend to trigger simultaneously the load balance loop but
the larger domains often need more time to collect statistics. This
slowness makes the larger domain trying to detach tasks from a rq whereas
tasks already migrated somewhere else at a sub-domain level. This is not
a real problem for idle LB because the period of smaller domains will
increase with its CPUs being busy and this will let time for higher ones
to pulled tasks. But this becomes a problem when all CPUs are already busy
because all domains stay synced when they trigger their LB.

A simple way to minimize simultaneous LB of all domains is to decrement the
the busy interval by 1 jiffies. Because of the busy_factor, the interval of
larger domain will not be a multiple of smaller ones anymore.

Reviewed-by: Phil Auld <pauld@xxxxxxxxxx>
Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
---
kernel/sched/fair.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d8320dc9d014..458702062d3b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9785,6 +9785,15 @@ get_sd_balance_interval(struct sched_domain *sd, int cpu_busy)

/* scale ms to jiffies */
interval = msecs_to_jiffies(interval);
+
+ /*
+ * Reduce likelihood of busy balancing at higher domains racing with
+ * balancing at lower domains by preventing their balancing periods
+ * from being multiples of each other.
+ */
+ if (cpu_busy)
+ interval -= 1;
+
interval = clamp(interval, 1UL, max_load_balance_interval);

return interval;
--
2.17.1