[tip: sched/core] sched/fair: Minimize concurrent LBs between domain level

From: tip-bot2 for Vincent Guittot
Date: Tue Sep 29 2020 - 03:57:02 EST


The following commit has been merged into the sched/core branch of tip:

Commit-ID: e4d32e4d5444977d8dc25fa98b3ce0a65544db8c
Gitweb: https://git.kernel.org/tip/e4d32e4d5444977d8dc25fa98b3ce0a65544db8c
Author: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
AuthorDate: Mon, 21 Sep 2020 09:24:23 +02:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Fri, 25 Sep 2020 14:23:26 +02:00

sched/fair: Minimize concurrent LBs between domain level

Sched domains tend to trigger their load balance loops simultaneously, but
the larger domains often need more time to collect statistics. This
slowness makes a larger domain try to detach tasks from a rq after those
tasks have already been migrated elsewhere at a sub-domain level. This is
not a real problem for idle LB, because the period of the smaller domains
increases as their CPUs become busy, which leaves time for the higher
domains to pull tasks. But it becomes a problem when all CPUs are already
busy, because all domains stay in sync when they trigger their LB.

A simple way to minimize simultaneous LB of all domains is to decrement the
busy interval by 1 jiffy. Because of the busy_factor, the interval of a
larger domain will then no longer be a multiple of the smaller ones.

Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Phil Auld <pauld@xxxxxxxxxx>
Link: https://lkml.kernel.org/r/20200921072424.14813-4-vincent.guittot@xxxxxxxxxx
---
kernel/sched/fair.c | 9 +++++++++
1 file changed, 9 insertions(+)
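
As a rough illustration of the argument in the changelog above, the sketch
below counts how often two busy balance periods fire on the same jiffy, with
and without the decrement. It is not kernel code: busy_interval() and
count_coincident() are hypothetical helpers, and the base intervals (8 ms and
16 ms), the busy_factor of 16 and HZ=1000 (so 1 ms equals 1 jiffy) are
assumptions chosen only for the example.

```c
#include <assert.h>

/*
 * Hypothetical helper, not a kernel function: busy balance interval in
 * jiffies, assuming HZ=1000 so that 1 ms == 1 jiffy.
 */
static unsigned long busy_interval(unsigned long base_ms,
				   unsigned long busy_factor,
				   int decrement)
{
	unsigned long interval;

	assert(base_ms > 0 && busy_factor > 0);

	interval = base_ms * busy_factor;
	if (decrement)
		interval -= 1;

	/* Mirror the lower bound of the clamp in the patch. */
	return interval ? interval : 1;
}

/*
 * Count the jiffies in (0, window] on which both periods expire together,
 * assuming both domains start balancing at jiffy 0.
 */
static unsigned long count_coincident(unsigned long small,
				      unsigned long large,
				      unsigned long window)
{
	unsigned long t, hits = 0;

	for (t = large; t <= window; t += large)
		if (t % small == 0)
			hits++;

	return hits;
}
```

Under these assumptions the undecremented intervals are 128 and 256 jiffies,
so every expiry of the larger domain lands on an expiry of the smaller one:
count_coincident(128, 256, 100000) gives 390. With the decrement they become
127 and 255 jiffies, which share no common factor, and
count_coincident(127, 255, 100000) gives 3.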

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5e3add3..24a5ee6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9790,6 +9790,15 @@ get_sd_balance_interval(struct sched_domain *sd, int cpu_busy)

 	/* scale ms to jiffies */
 	interval = msecs_to_jiffies(interval);
+
+	/*
+	 * Reduce likelihood of busy balancing at higher domains racing with
+	 * balancing at lower domains by preventing their balancing periods
+	 * from being multiples of each other.
+	 */
+	if (cpu_busy)
+		interval -= 1;
+
 	interval = clamp(interval, 1UL, max_load_balance_interval);
 
 	return interval;