[PATCH v5 5/6] sched/fair: Allow load balancing between CPUs of identical capacity
From: Ricardo Neri
Date: Mon Jun 22 2026 - 19:57:22 EST
sched_balance_find_src_rq() avoids selecting a runqueue with a single
running task as busiest if doing so results in migrating the task to a
CPU with less than ~5% of extra capacity. As a side effect, it also
prevents migrations between CPUs of identical capacity.

When CONFIG_SCHED_CLUSTER is enabled, load should be balanced across
clusters of CPUs with the same capacity. Allowing migration between CPUs
of identical capacity is necessary to meet this goal.

Use arch_scale_cpu_capacity() to compare architectural capacity, which
excludes runtime reductions due to side activity or thermal pressure.
Guard this check with the sched_cluster_active static key so that systems
without cluster topology are unaffected.

Tested-by: Christian Loehle <christian.loehle@xxxxxxx>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
---
Changes in v5:
* Optimized logic to identify same-arch clusters only when needed.
* Added Tested-by tag from Christian. Thanks!
Changes in v4:
* Implemented the cluster check with a local variable for improved
  readability.
Changes in v3:
* Reverted the inverted capacity check; the inverted form incorrectly
  allows migrations to CPUs of slightly less capacity.
* Guarded the check for architectural capacity with the
  sched_cluster_active static key.
Changes in v2:
* Used arch_scale_cpu_capacity() instead of capacity_of() to ignore
  runtime variability.
* Inverted the check for runtime capacity. (Christian)
* Reworded patch description for clarity.
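
For reviewers, the skip decision described above can be modeled in userspace
roughly as follows. This is only a sketch under stated assumptions, not the
kernel code: struct cpu_model and skip_src_cpu() are hypothetical names, the
two capacity fields stand in for arch_scale_cpu_capacity() and capacity_of(),
and the static key and SMT state are passed in as plain booleans. The
capacity_greater() margin (1078/1024, about 5%) mirrors the definition in
kernel/sched/fair.c.

```c
#include <stdbool.h>

/* Same ~5% margin as capacity_greater() in kernel/sched/fair.c. */
#define capacity_greater(cap1, cap2) ((cap1) * 1024 > (cap2) * 1078)

/* Hypothetical stand-in for the per-CPU values the check reads. */
struct cpu_model {
	unsigned long arch_cap;	/* models arch_scale_cpu_capacity() */
	unsigned long cap;	/* models capacity_of(), i.e. runtime capacity */
};

/*
 * Returns true if a source CPU running a single task would be skipped as
 * the busiest candidate. cluster_active models the sched_cluster_active
 * static key; smt_degraded_cap models busy SMT siblings on the source CPU.
 */
static bool skip_src_cpu(bool cluster_active, bool smt_degraded_cap,
			 const struct cpu_model *dst,
			 const struct cpu_model *src)
{
	/* Architectural capacity ignores runtime variability. */
	bool same_arch_cluster = cluster_active &&
				 dst->arch_cap == src->arch_cap;

	return !smt_degraded_cap && !same_arch_cluster &&
	       !capacity_greater(dst->cap, src->cap);
}
```

With two CPUs of identical architectural capacity and only slightly different
runtime capacity, the model skips the source CPU when cluster_active is false
(the behavior before this patch) and keeps it as a candidate when
cluster_active is true.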
---
kernel/sched/fair.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e55eb019d2c9..f4eb55cad54d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12992,13 +12992,20 @@ static struct rq *sched_balance_find_src_rq(struct lb_env *env,
*/
if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
nr_running == 1) {
+ bool same_arch_cluster = static_branch_unlikely(&sched_cluster_active) &&
+ (arch_scale_cpu_capacity(env->dst_cpu) ==
+ arch_scale_cpu_capacity(i));
bool smt_degraded_cap = sched_smt_active() && !is_core_idle(i);
/*
* Busy SMT siblings reduce the capacity of CPU @i. Do
* not skip it in this case.
+ *
+ * CONFIG_SCHED_CLUSTER requires balancing load across clusters
+ * of identical capacity. Use architectural capacity to ignore
+ * runtime variability.
*/
- if (!smt_degraded_cap &&
+ if (!smt_degraded_cap && !same_arch_cluster &&
!capacity_greater(capacity_of(env->dst_cpu), capacity))
continue;
}
--
2.43.0