[PATCH v8 4/4] sched/fair: Don't double balance_interval for migrate_misfit

From: Qais Yousef
Date: Sat Mar 23 2024 - 20:46:54 EST


A misfit migration is not necessarily an indication that the system is
busy and that the load balancer needs to back off. Pushing
balance_interval high, on the other hand, could delay other misfit
migrations or the handling of other types of imbalance.

Also don't pollute nr_balance_failed because of misfit failures. The
value is used for enabling cache hot migration and in the
migrate_util/migrate_load types, none of which should be skewed by
misfit failures.

Reviewed-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Signed-off-by: Qais Yousef <qyousef@xxxxxxxxxxx>
---
kernel/sched/fair.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
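
For illustration only, the two conditions changed below can be modelled
in user space roughly as follows. This is a minimal sketch and not
kernel code: the enums are simplified stand-ins for the kernel's own
definitions, and the helper names counts_as_balance_failure() and
skips_interval_backoff() are hypothetical and do not exist in fair.c.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel enums (assumption: values are illustrative). */
enum idle_type { CPU_IDLE, CPU_NOT_IDLE, CPU_NEWLY_IDLE };
enum migration_type { migrate_load, migrate_util, migrate_task, migrate_misfit };

/*
 * Hypothetical helper: does a failed balance attempt bump
 * sd->nr_balance_failed? Neither newidle balance nor misfit
 * migration should pollute the counter.
 */
static bool counts_as_balance_failure(enum idle_type idle,
				      enum migration_type type)
{
	return idle != CPU_NEWLY_IDLE && type != migrate_misfit;
}

/*
 * Hypothetical helper: is the "tune up the balancing interval"
 * step skipped? It is skipped for newidle balance and for misfit
 * migration.
 */
static bool skips_interval_backoff(enum idle_type idle,
				   enum migration_type type)
{
	return idle == CPU_NEWLY_IDLE || type == migrate_misfit;
}
```

With this model, a periodic balance of type migrate_load still counts
failures and still backs off, while any attempt of type migrate_misfit
does neither.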

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3b88cf58fb45..18da54da48a5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11443,8 +11443,12 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
* We do not want newidle balance, which can be very
* frequent, pollute the failure counter causing
* excessive cache_hot migrations and active balances.
+ *
+ * Similarly for migrate_misfit, which is not related to
+ * load/util migration, don't pollute nr_balance_failed.
*/
- if (idle != CPU_NEWLY_IDLE)
+ if (idle != CPU_NEWLY_IDLE &&
+ env.migration_type != migrate_misfit)
sd->nr_balance_failed++;

if (need_active_balance(&env)) {
@@ -11527,8 +11531,13 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
* repeatedly reach this code, which would lead to balance_interval
* skyrocketing in a short amount of time. Skip the balance_interval
* increase logic to avoid that.
+ *
+ * Similarly, misfit migration is not necessarily an indication that
+ * the system is busy and that lb needs to back off to let it settle
+ * down.
*/
- if (env.idle == CPU_NEWLY_IDLE)
+ if (env.idle == CPU_NEWLY_IDLE ||
+ env.migration_type == migrate_misfit)
goto out;

/* tune up the balancing interval */
--
2.34.1