[PATCH v2 2/4] sched/fair: Skip misfit load accounting when the destination CPU cannot help

From: Ricardo Neri

Date: Wed Apr 29 2026 - 17:22:22 EST


In domains with asymmetric capacity, accounting misfit load in a
scheduling group is not useful when the destination CPU cannot help (i.e.,
its capacity does not exceed the group's maximum CPU capacity by more than
~5%). In such cases, the accounting also prevents load balancing among
clusters of equal capacity when CONFIG_SCHED_CLUSTER is enabled. This
happens because update_sd_pick_busiest() skips candidate groups of type
misfit_task if the destination CPU has similar capacity.
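
For reference, the ~5% margin comes from capacity_greater() in
kernel/sched/fair.c, which compares cap1 * 1024 against cap2 * 1078. The
userspace sketch below mirrors that arithmetic. dst_can_help() is a
hypothetical wrapper used only for illustration; it does not exist in the
kernel.

```c
#include <stdbool.h>

/*
 * Same arithmetic as the capacity_greater() macro in kernel/sched/fair.c:
 * true only if cap1 exceeds cap2 by more than ~5% (1078/1024).
 */
static bool capacity_greater(unsigned long cap1, unsigned long cap2)
{
	return cap1 * 1024 > cap2 * 1078;
}

/*
 * Hypothetical helper (illustration only): can a destination CPU of
 * capacity dst_cap help a group whose biggest CPU has group_max_cap?
 */
static bool dst_can_help(unsigned long dst_cap, unsigned long group_max_cap)
{
	return capacity_greater(dst_cap, group_max_cap);
}
```

With a destination capacity of 1024, a group maximum of 972 passes the
check while 973 does not, so CPUs of equal or nearly equal capacity are
treated as unable to help.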

Skipping misfit load accounting in this situation allows the group to be
classified as has_spare or fully_busy and lets load balancing proceed. Keep
marking scheduling groups as overloaded when misfit tasks are present. This
flag propagates to the root domain and allows bigger CPUs in it to help
via newly idle balance.
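
The intended per-runqueue behavior can be modeled in userspace roughly as
follows. This is a simplified sketch, not the kernel code: struct rq_model,
struct sg_stats_model and account_misfit() are hypothetical names, and the
capacities are passed in directly instead of being read through
capacity_of() and group->sgc->max_capacity.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the fields of struct rq and struct sg_lb_stats. */
struct rq_model {
	unsigned long misfit_task_load;
};

struct sg_stats_model {
	unsigned long group_misfit_task_load;
};

/* Same arithmetic as capacity_greater() in kernel/sched/fair.c (~5% margin). */
static bool capacity_greater(unsigned long cap1, unsigned long cap2)
{
	return cap1 * 1024 > cap2 * 1078;
}

/* Model of the SD_ASYM_CPUCAPACITY branch after this patch. */
static void account_misfit(struct sg_stats_model *sgs,
			   const struct rq_model *rq,
			   unsigned long dst_cap,
			   unsigned long group_max_cap,
			   bool *sg_overloaded)
{
	if (!rq->misfit_task_load)
		return;

	/* Always flag overload so bigger CPUs elsewhere can still pull. */
	*sg_overloaded = true;

	/* Account the misfit load only if the destination CPU can help. */
	if (capacity_greater(dst_cap, group_max_cap) &&
	    sgs->group_misfit_task_load < rq->misfit_task_load)
		sgs->group_misfit_task_load = rq->misfit_task_load;
}
```

In this model, a destination CPU of the same capacity as the group leaves
group_misfit_task_load at zero while still setting the overloaded flag,
which is the combination the description above relies on.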

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
---
Changes since v1:
* Moved the check of the destination CPU capacity inside the code block
used for SD_ASYM_CPUCAPACITY. v1 inadvertently broke the mutual
exclusion of the sched_reduced_capacity() path.
* Keep marking the root domain as overloaded to allow bigger CPUs to
help. (sashiko)
* Fixed patch description to clarify that capacity_greater() looks for
differences of 5% or more. (Christian)
* Reworded the patch description for clarity.
* I did not include the Reviewed-by tag from Christian since the patch
changed functionally.
---
kernel/sched/fair.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0dbed82aa63f..166a5b109e0e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10719,10 +10719,24 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 			continue;

 		if (sd_flags & SD_ASYM_CPUCAPACITY) {
-			/* Check for a misfit task on the cpu */
-			if (sgs->group_misfit_task_load < rq->misfit_task_load) {
-				sgs->group_misfit_task_load = rq->misfit_task_load;
+			if (rq->misfit_task_load) {
+				/*
+				 * Always mark the domain overloaded so big CPUs
+				 * can pick up misfit tasks via newly idle
+				 * balance.
+				 */
 				*sg_overloaded = 1;
+
+				/*
+				 * Only account misfit load if @dst_cpu can
+				 * help, otherwise the group may be classified
+				 * as misfit_task and update_sd_pick_busiest()
+				 * will skip it.
+				 */
+				if (capacity_greater(capacity_of(env->dst_cpu),
+						     group->sgc->max_capacity) &&
+				    (sgs->group_misfit_task_load < rq->misfit_task_load))
+					sgs->group_misfit_task_load = rq->misfit_task_load;
 			}
 		} else if (env->idle && sched_reduced_capacity(rq, env->sd)) {
 			/* Check for a task running on a CPU with reduced capacity */

--
2.43.0