Re: [PATCH 1/2] sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection
From: K Prateek Nayak
Date: Tue Apr 21 2026 - 07:23:54 EST
Hello Andrea,
On 4/21/2026 3:05 PM, Andrea Righi wrote:
> Actually... while preparing the series I realized that in select_idle_capacity()
> we may end up clearing the has_idle_cores hint even when the failure is due to
> affinity constraints (no fit CPU in the allowed cpumask), not only when no fully
> idle core is found in the system and this can lead to false has_idle_cores
> hints.
This is also the case with select_idle_cpu() but any core turning
idle will again reset the indicator so it should be fine for most
part where there is a lot of blocking + wakeup.
>
> At this point I'm wondering if it's better to just ignore the has_idle_cores
> hint completely in the smt+asym-cpu-capacity scenario (which would also simplify
> the exotic topology cases).
>
> I did some quick tests with this on Vera and I'm getting pretty much the same
> performance results. Opinions? Am I missing something?
I don't think so. Generally it is counterproductive to search a lot
in a busy system but I guess just making the path SMT aware give a
much better result compared to the baseline that it doesn't matter.
Can I trouble you to test the SIS_UTIL bailout with your series +
the topology changes:
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 78f2d2c4e24f..1356bbdbccd4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7990,6 +7990,7 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
int fits, best_fits = 0;
int cpu, best_cpu = -1;
struct cpumask *cpus;
+ int nr = INT_MAX;
cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
@@ -7998,10 +7999,30 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
util_min = uclamp_eff_value(p, UCLAMP_MIN);
util_max = uclamp_eff_value(p, UCLAMP_MAX);
+ if (sched_feat(SIS_UTIL) && sd->shared) {
+ /*
+ * Increment because !--nr is the condition to stop scan.
+ *
+ * Since "sd" is "sd_llc" for target CPU dereferenced in the
+ * caller, it is safe to directly dereference "sd->shared".
+ * Topology bits always ensure it assigned for "sd_llc" abd it
+ * cannot disappear as long as we have a RCU protected
+ * reference to one the associated "sd" here.
+ */
+ nr = READ_ONCE(sd->shared->nr_idle_scan) + 1;
+ /* overloaded LLC is unlikely to have idle cpu/core */
+ if (nr == 1)
+ return -1;
+ }
+
for_each_cpu_wrap(cpu, cpus, target) {
bool preferred_core = !prefers_idle_core || is_core_idle(cpu);
unsigned long cpu_cap = capacity_of(cpu);
+ /* We have found a good enough target. Just use it. */
+ if (--nr <= 0 && best_fits == -4)
+ return best_cpu;
+
if (!choose_idle_cpu(cpu, p))
continue;
---
You can also try "best_fits <= -3" in that last bailout condition and
see if that help.
--
Thanks and Regards,
Prateek