Re: [PATCH v2 3/4] sched/fair: Allow load balancing between CPUs of identical capacity

From: Christian Loehle

Date: Wed May 06 2026 - 09:13:29 EST


On 4/29/26 22:19, Ricardo Neri wrote:
> sched_balance_find_src_rq() avoids selecting a runqueue with a single
> running task as busiest if doing so results in migrating the task to a
> CPU with less than ~5% of extra capacity. It also unintentionally
> prevents migrations between CPUs of identical capacity.
>
> When CONFIG_SCHED_CLUSTER is enabled, load should be balanced across
> clusters of CPUs with the same capacity. Allowing migration between CPUs
> of identical capacity is necessary to meet this goal.
>
> We are interested in the architectural capacity of the involved CPUs,
> excluding any reductions due to side activity or thermal pressure. Use
> arch_scale_cpu_capacity().
>
> While here, invert the check for runtime capacity for clarity.
>
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
> ---
> Changes since v1:
> * Used arch_scale_cpu_capacity() instead of capacity_of() to ignore
> runtime variability.
> * Inverted the check for runtime capacity. (Christian)
> * Reworded patch description for clarity.
> ---
> kernel/sched/fair.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 166a5b109e0e..4105717e64fe 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11816,9 +11816,14 @@ static struct rq *sched_balance_find_src_rq(struct lb_env *env,
> * eventually lead to active_balancing high->low capacity.
> * Higher per-CPU capacity is considered better than balancing
> * average load.
> + *
> + * Cluster scheduling requires balancing load across clusters
> + * of identical capacity. Use architectural capacity to ignore
> + * runtime variability.
> */
> if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> - !capacity_greater(capacity_of(env->dst_cpu), capacity) &&
> + arch_scale_cpu_capacity(env->dst_cpu) != arch_scale_cpu_capacity(i) &&
> + capacity_greater(capacity, capacity_of(env->dst_cpu)) &&
> nr_running == 1)
> continue;
>
>

I wonder if we shouldn't use the capacity_greater() margin for both, i.e.
capacity_greater(arch_scale_cpu_capacity(i), arch_scale_cpu_capacity(env->dst_cpu)) &&

For example, the Orion O6 has one cluster with capacity 1024 and one with 984.
If we allow balancing 984->984, I think it's only consistent to also allow
984->1024.
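
To make the numbers concrete, here is a minimal user-space sketch of just the
architectural-capacity term. It assumes capacity_greater() keeps its current
fair.c definition (cap1 * 1024 > cap2 * 1078, i.e. roughly a 5% margin); the
two helper names are made up for illustration and are not part of the patch.
The runtime-capacity and nr_running parts of the condition are left out.

```c
#include <stdbool.h>

/* Assumed to mirror the macro in kernel/sched/fair.c (~5% margin). */
#define capacity_greater(cap1, cap2) ((cap1) * 1024 > (cap2) * 1078)

/*
 * Hypothetical helper: the architectural term as written in v2.
 * Any difference between src and dst capacity counts.
 */
static bool arch_term_exact(unsigned long src_cap, unsigned long dst_cap)
{
	return dst_cap != src_cap;
}

/*
 * Hypothetical helper: the architectural term with the margin applied.
 * Only a src capacity more than ~5% above dst counts.
 */
static bool arch_term_margin(unsigned long src_cap, unsigned long dst_cap)
{
	return capacity_greater(src_cap, dst_cap);
}
```

With 984 and 1024, arch_term_exact() is true in both directions, while
arch_term_margin() is false in both directions because the two values are
within the margin of each other (1024 * 1024 = 1048576 is not greater than
984 * 1078 = 1060752). Both helpers return false for 984 vs 984.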