Re: [PATCH v5 5/6] sched/fair: Allow load balancing between CPUs of identical capacity
From: Andrea Righi
Date: Sat Jun 27 2026 - 15:07:47 EST
Hi Ricardo,

On Mon, Jun 22, 2026 at 05:05:55PM -0700, Ricardo Neri wrote:
> sched_balance_find_src_rq() avoids selecting a runqueue with a single
> running task as busiest if doing so results in migrating the task to a
> CPU with less than ~5% of extra capacity. It also unintentionally
> prevents migrations between CPUs of identical capacity.
>
> When CONFIG_SCHED_CLUSTER is enabled, load should be balanced across
> clusters of CPUs with the same capacity. Allowing migration between CPUs
> of identical capacity is necessary to meet this goal.
>
> Use arch_scale_cpu_capacity() to reflect architectural capacity, excluding
> runtime reductions due to side activity or thermal pressure. Guard this
> check with the sched_cluster_active static key so that systems without
> cluster topology are unaffected.
>
> Tested-by: Christian Loehle <christian.loehle@xxxxxxx>
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
> ---
> Changes in v5:
> * Optimized logic to identify same-arch clusters only when needed.
> * Added Tested-by tag from Christian. Thanks!
>
> Changes in v4:
> * Implemented the check for cluster with a local variable for improved
> readability.
>
> Changes in v3:
> * Reverted the inverted capacity check; the inverted form incorrectly
> allows migrations to CPUs of slightly less capacity.
> * Guarded the check for architectural capacity with the
> sched_cluster_active static key.
>
> Changes in v2:
> * Used arch_scale_cpu_capacity() instead of capacity_of() to ignore
> runtime variability.
> * Inverted the check for runtime capacity. (Christian)
> * Reworded patch description for clarity.
> ---
> kernel/sched/fair.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e55eb019d2c9..f4eb55cad54d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -12992,13 +12992,20 @@ static struct rq *sched_balance_find_src_rq(struct lb_env *env,
> */
> if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> nr_running == 1) {
> + bool same_arch_cluster = static_branch_unlikely(&sched_cluster_active) &&
> + (arch_scale_cpu_capacity(env->dst_cpu) ==
> + arch_scale_cpu_capacity(i));

I find same_arch_cluster a bit misleading. It sounds like "these two CPUs belong
to the same cluster", while what it actually checks is whether a cluster
topology exists somewhere in the root domain and the two CPUs have exactly the
same architectural capacity. Am I understanding it correctly?

If so, would something like same_arch_capacity or cluster_equal_capacity be a
better name? I think either would make the intent of the code a bit clearer.
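
For illustration only, here is an untested user-space sketch (not a proposed
patch) of the condition with the variable renamed to same_arch_capacity. The
arch_capacity[] table, the cluster_active parameter and skip_single_task_rq()
are made-up names that stand in for arch_scale_cpu_capacity(), the
sched_cluster_active static key and the check in sched_balance_find_src_rq();
capacity_greater() is assumed to use the same ~5% margin as in fair.c.

```c
/*
 * User-space model, not kernel code. All names except capacity_greater()
 * are hypothetical stand-ins chosen for this example.
 */
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical architectural capacities: two "big" and two "little" CPUs. */
static const unsigned long arch_capacity[NR_CPUS] = { 1024, 1024, 512, 512 };

/* True if cap1 exceeds cap2 by more than ~5%. */
#define capacity_greater(cap1, cap2) ((cap1) * 1024 > (cap2) * 1078)

/*
 * Return true if a source CPU running a single task should be skipped as
 * a candidate for busiest. dst_cap and src_cap model the runtime
 * capacities; cluster_active models the static key.
 */
bool skip_single_task_rq(int dst_cpu, int src_cpu,
			 unsigned long dst_cap, unsigned long src_cap,
			 bool smt_degraded_cap, bool cluster_active)
{
	/* Equal architectural capacity, regardless of cluster membership. */
	bool same_arch_capacity = cluster_active &&
				  arch_capacity[dst_cpu] == arch_capacity[src_cpu];

	return !smt_degraded_cap && !same_arch_capacity &&
	       !capacity_greater(dst_cap, src_cap);
}
```

With the sample table, CPUs 0 and 1 have the same architectural capacity, so
CPU 1 is not skipped when cluster_active is true, even if the runtime capacity
of CPU 0 is slightly lower. Nothing in the condition looks at which cluster
either CPU belongs to, which is why the name matters.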

Thanks,
-Andrea

> bool smt_degraded_cap = sched_smt_active() && !is_core_idle(i);
>
> /*
> * Busy SMT siblings reduce the capacity of CPU @i. Do
> * not skip it in this case.
> + *
> + * CONFIG_SCHED_CLUSTER requires balancing load across clusters
> + * of identical capacity. Use architectural capacity to ignore
> + * runtime variability.
> */
> - if (!smt_degraded_cap &&
> + if (!smt_degraded_cap && !same_arch_cluster &&
> !capacity_greater(capacity_of(env->dst_cpu), capacity))
> continue;
> }
>
> --
> 2.43.0
>