Re: [PATCH 4/5] sched/fair: Add SIS_UTIL support to select_idle_capacity()

From: Vincent Guittot

Date: Fri Apr 24 2026 - 08:32:50 EST


On Thu, 23 Apr 2026 at 09:42, Andrea Righi <arighi@xxxxxxxxxx> wrote:
>
> From: K Prateek Nayak <kprateek.nayak@xxxxxxx>
>
> Add to select_idle_capacity() the same SIS_UTIL-controlled idle-scan
> mechanism, already used by select_idle_cpu(): when sched_feat(SIS_UTIL)
> is enabled and the LLC domain has sched_domain_shared data, derive the
> per-attempt scan limit from sd->shared->nr_idle_scan.
>
> That bounds the walk on large LLCs and allows an early return once the
> scan limit is reached, if we already picked a sufficiently strong
> idle-core candidate (best_fits == -4).
>
> Signed-off-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> ---
> kernel/sched/fair.c | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9bd9dc6e0882e..6b67049f04c3e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8002,6 +8002,7 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
>  	int fits, best_fits = 0;
>  	int cpu, best_cpu = -1;
>  	struct cpumask *cpus;
> +	int nr = INT_MAX;
>
>  	cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
>  	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> @@ -8010,10 +8011,30 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
>  	util_min = uclamp_eff_value(p, UCLAMP_MIN);
>  	util_max = uclamp_eff_value(p, UCLAMP_MAX);
>
> +	if (sched_feat(SIS_UTIL) && sd->shared) {
> +		/*
> +		 * Increment because !--nr is the condition to stop scan.
> +		 *
> +		 * Since "sd" is "sd_llc" for target CPU dereferenced in the
> +		 * caller, it is safe to directly dereference "sd->shared".
> +		 * Topology bits always ensure it assigned for "sd_llc" and it
> +		 * cannot disappear as long as we have a RCU protected
> +		 * reference to one the associated "sd" here.
> +		 */
> +		nr = READ_ONCE(sd->shared->nr_idle_scan) + 1;
> +		/* overloaded LLC is unlikely to have idle cpu/core */
> +		if (nr == 1)
> +			return -1;

The comment below applies to select_idle_cpu(), but we want the same
behavior for both functions.
If test_idle_cores is true, we will not look for an idle core here, whereas
we don't care about the nr value when test_idle_cores is true in the
for_each_cpu_wrap() loop.


> +	}
> +
>  	for_each_cpu_wrap(cpu, cpus, target) {
>  		bool preferred_core = !prefers_idle_core || is_core_idle(cpu);
>  		unsigned long cpu_cap = capacity_of(cpu);
>
> +		/* We have found a good enough target. Just use it. */
> +		if (--nr <= 0 && best_fits == -4)
> +			return best_cpu;

In select_idle_cpu(), we return immediately when nr == 0 and
test_idle_cores is false, but when test_idle_cores is true we loop over
all CPUs until we find an idle core. In the case of
select_idle_capacity(), I agree that util_fits_cpu() adds another level,
but shouldn't we continue to loop even if we found a best_fits == -4?
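
To make the select_idle_cpu() behavior described above concrete, here is a
rough userspace sketch. It is not kernel code: struct model_cpu and
model_scan() are made-up names for illustration, the wrap-around walk is a
simplification of for_each_cpu_wrap(), and has_idle_core stands in for the
test_idle_cores result.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-CPU state; not a kernel structure. */
struct model_cpu {
	bool idle;	/* this CPU is idle */
	bool core_idle;	/* all SMT siblings of this CPU are idle */
};

/*
 * Simplified model of the scan policy:
 *  - has_idle_core == false: give up with -1 once the budget "nr" runs out,
 *    otherwise return the first idle CPU;
 *  - has_idle_core == true: "nr" is ignored, every CPU is visited until an
 *    idle core is found, falling back to the first idle CPU seen.
 */
int model_scan(const struct model_cpu *cpus, int nr_cpus, int target,
	       bool has_idle_core, int nr)
{
	int idle_cpu = -1;

	for (int n = 0; n < nr_cpus; n++) {
		int cpu = (target + n) % nr_cpus;	/* wrap around from target */

		if (has_idle_core) {
			if (cpus[cpu].core_idle)
				return cpu;
			if (idle_cpu < 0 && cpus[cpu].idle)
				idle_cpu = cpu;
		} else {
			if (--nr <= 0)
				return -1;
			if (cpus[cpu].idle)
				return cpu;
		}
	}

	return idle_cpu;
}
```

With the same small budget, the model stops early in the has_idle_core ==
false case, but still walks the whole mask in the has_idle_core == true case.
That asymmetry is what the question above is about.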

> +
>  		if (!choose_idle_cpu(cpu, p))
>  			continue;
>
> --
> 2.54.0
>