[PATCH 1/2] sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection

From: Andrea Righi

Date: Fri Apr 03 2026 - 01:38:00 EST


On systems with asymmetric CPU capacity (e.g., ACPI/CPPC reporting
different per-core frequencies), the wakeup path uses
select_idle_capacity() and prioritizes idle CPUs with higher capacity
for better task placement.

However, when those CPUs belong to SMT cores, their effective capacity
can be much lower than the nominal capacity when the sibling thread is
busy: SMT siblings compete for shared resources, so a "high capacity"
CPU that is idle but whose sibling is busy does not deliver its full
capacity. This effective capacity reduction cannot be modeled by the
static capacity value alone.

When SMT is active, teach asym-capacity idle selection to treat a
logical CPU as a weaker target if its physical core is only partially
idle: select_idle_capacity() no longer returns the first idle CPU whose
static capacity fits the task if that CPU still has a busy sibling.
Instead, it keeps scanning for an idle CPU on a fully-idle core and
falls back to partially-idle cores only if none qualify, using shifted
fit scores so that fully-idle cores win ties. asym_fits_cpu() applies
the same fully-idle core requirement when asym capacity and SMT are
both active.

This improves task placement: partially-idle SMT siblings deliver less
than their nominal capacity, so favoring fully-idle cores, when
available, can significantly improve both throughput and wakeup latency
on systems that combine SMT and CPU asymmetry.

No functional changes on systems with only asymmetric CPUs or only SMT.
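For reference, the score shifting can be illustrated outside the kernel. This is not part of the patch: rank_candidate() below is a hypothetical user-space stand-in for the selection logic in select_idle_capacity(), showing how the raw util_fits_cpu() values are remapped so that any candidate on a fully-idle core outranks any candidate with a busy sibling (lower score is preferred):

```c
#include <stdbool.h>

/*
 * Illustration only, not kernel code.
 *
 * Raw fits values, as returned by util_fits_cpu():
 *    1 - fits fully
 *    0 - does not fit
 *   -1 - fits except for the uclamp_min hint
 */
static int rank_candidate(int fits, bool preferred_core)
{
	/* A fully fitting CPU on a fully-idle core is taken immediately. */
	if (fits > 0 && preferred_core)
		return -100; /* stand-in for "return cpu" */

	/* Fits, but the core has a busy sibling: score -2. */
	if (fits > 0)
		fits = -2;

	/* On a fully-idle core, shift [-1, 0] down to [-4, -3]. */
	if (preferred_core)
		fits -= 3;

	return fits;
}
```

With this mapping, every fully-idle-core score (-4 or -3) is strictly lower than every busy-sibling score (-2, -1 or 0), so the "lower fits wins" comparison in the scan naturally prefers idle cores before falling back.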

Cc: K Prateek Nayak <kprateek.nayak@xxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Christian Loehle <christian.loehle@xxxxxxx>
Cc: Koba Ko <kobak@xxxxxxxxxx>
Reported-by: Felix Abecassis <fabecassis@xxxxxxxxxx>
Signed-off-by: Andrea Righi <arighi@xxxxxxxxxx>
---
kernel/sched/fair.c | 36 ++++++++++++++++++++++++++++++++----
1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bf948db905ed1..7f09191014d18 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7774,6 +7774,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
static int
select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
{
+ bool prefers_idle_core = sched_smt_active() && test_idle_cores(target);
unsigned long task_util, util_min, util_max, best_cap = 0;
int fits, best_fits = 0;
int cpu, best_cpu = -1;
@@ -7787,6 +7788,7 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
util_max = uclamp_eff_value(p, UCLAMP_MAX);

for_each_cpu_wrap(cpu, cpus, target) {
+ bool preferred_core = !prefers_idle_core || is_core_idle(cpu);
unsigned long cpu_cap = capacity_of(cpu);

if (!available_idle_cpu(cpu) && !sched_idle_cpu(cpu))
@@ -7795,7 +7797,7 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
fits = util_fits_cpu(task_util, util_min, util_max, cpu);

/* This CPU fits with all requirements */
- if (fits > 0)
+ if (fits > 0 && preferred_core)
return cpu;
/*
* Only the min performance hint (i.e. uclamp_min) doesn't fit.
@@ -7803,9 +7805,30 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
*/
else if (fits < 0)
cpu_cap = get_actual_cpu_capacity(cpu);
+ /*
+ * fits > 0 implies we are not on a preferred core
+ * but the util fits CPU capacity. Set fits to -2 so
+ * the effective range becomes [-2, 0] where:
+ * 0 - does not fit
+ * -1 - fits with the exception of UCLAMP_MIN
+ * -2 - fits with the exception of preferred_core
+ */
+ else if (fits > 0)
+ fits = -2;
+
+ /*
+ * If we are on a preferred core, translate the fits range
+ * [-1, 0] to [-4, -3]. This ensures that an idle core is
+ * always given priority over a (partially) busy core.
+ *
+ * A fully fitting idle core would have returned early and hence
+ * fits > 0 for preferred_core need not be dealt with.
+ */
+ if (preferred_core)
+ fits -= 3;

/*
- * First, select CPU which fits better (-1 being better than 0).
+ * First, select CPU which fits better (lower is more preferred).
* Then, select the one with best capacity at same level.
*/
if ((fits < best_fits) ||
@@ -7824,12 +7847,17 @@ static inline bool asym_fits_cpu(unsigned long util,
unsigned long util_max,
int cpu)
{
- if (sched_asym_cpucap_active())
+ if (sched_asym_cpucap_active()) {
/*
* Return true only if the cpu fully fits the task requirements
* which include the utilization and the performance hints.
+ *
+ * When SMT is active, also require that the core has no busy
+ * siblings.
*/
- return (util_fits_cpu(util, util_min, util_max, cpu) > 0);
+ return (!sched_smt_active() || is_core_idle(cpu)) &&
+ (util_fits_cpu(util, util_min, util_max, cpu) > 0);
+ }

return true;
}
--
2.53.0