Re: [PATCH 5/5] sched/fair: Add SIS_UTIL support to select_idle_capacity()

From: Vincent Guittot

Date: Thu May 07 2026 - 02:48:03 EST

On Wed, 6 May 2026 at 20:11, Andrea Righi <arighi@xxxxxxxxxx> wrote:
>
> Hi Dietmar and Vincent,
>
> On Wed, May 06, 2026 at 07:01:35PM +0200, Dietmar Eggemann wrote:
> > On 06.05.26 14:59, Vincent Guittot wrote:
> > > On Tue, 28 Apr 2026 at 16:44, Andrea Righi <arighi@xxxxxxxxxx> wrote:
> > >>
> > >> From: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> >
> > [...]
> >
> > >> @@ -8026,10 +8027,28 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> > >> util_min = uclamp_eff_value(p, UCLAMP_MIN);
> > >> util_max = uclamp_eff_value(p, UCLAMP_MAX);
> > >>
> > >> + if (sched_feat(SIS_UTIL) && sd->shared) {
> > >> + /*
> > >> + * Same nr_idle_scan hint as select_idle_cpu(), nr only limits
> > >> + * the scan when not preferring an idle core.
> > >> + */
> > >> + nr = READ_ONCE(sd->shared->nr_idle_scan) + 1;
> > >> + /* overloaded domain is unlikely to have idle cpu/core */
> > >> + if (nr == 1)
> > >> + return -1;
> > >> + }
> > >> +
> > >> for_each_cpu_wrap(cpu, cpus, target) {
> > >> bool preferred_core = !prefers_idle_core || is_core_idle(cpu);
> > >> unsigned long cpu_cap = capacity_of(cpu);
> > >>
> > >> + /*
> > >> + * Good-enough early exit (mirrors select_idle_cpu() logic).
> > >> + */
> > >> + if (!prefers_idle_core &&
> > >> + --nr <= 0 && best_fits == ASYM_IDLE_CORE_UCLAMP_MISFIT)
> > >
> > > With SMT, !prefers_idle_core implies that there is no idle core; Is
> > > best_fits == ASYM_IDLE_CORE_UCLAMP_MISFIT really expected in such case
> > > ?
> > >
> > > With !SMT, !prefers_idle_core is always true and we will bail out
> > > early as expected
> >
> > I struggle to comprehend:
> >
> > I assume the mirrored select_idle_cpu() logic is:
> >
> > for_each_cpu_wrap(cpu, cpus, target + 1)
> >
> > if (has_idle_core)
> >
> > else
> > if (--nr <= 0)
> > return -1
>
> So, the logic in select_idle_cpu() is that as soon as nr <= 0, we stop the walk
> and return -1, without any "only stop if the answer is good enough" guard.
>
> With this change in select_idle_capacity() when nr is exhausted, we stop only if
> best_cpu is "good enough" (ASYM_IDLE_CORE_UCLAMP_MISFIT), otherwise we keep
> scanning. Therefore, we're not perfectly mirroring select_idle_cpu().

Okay, one reason for my confusion is the following:

With !SMT, preferred_core is always true and CPU == core in asym_fits_state

With SMT and test_idle_cores being true, preferred_core reflects
core/CPU idleness

But with SMT and test_idle_cores being false, prefers_idle_core is
false, so preferred_core is always true and we are back to the !SMT
case where CPU == core in the asym_fits_state

So the condition is relevant:

    if (!prefers_idle_core &&
        --nr <= 0 && best_fits == ASYM_IDLE_CORE_UCLAMP_MISFIT)

We need a better description of which asym_fits_state range is used in
which conditions
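
To make the three cases concrete, here is a minimal userspace sketch, not
kernel code. The only expression taken from the patch is the preferred_core
assignment. That prefers_idle_core is "SMT active and test_idle_cores() true"
is an assumption on my side, and smt_active, has_idle_cores and core_idle are
hypothetical stand-ins for the real helpers.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * ASSUMPTION: prefers_idle_core is only set with SMT active and the
 * idle-cores hint being true. Both parameters are hypothetical inputs.
 */
static bool toy_prefers_idle_core(bool smt_active, bool has_idle_cores)
{
	return smt_active && has_idle_cores;
}

/* Same expression as in the quoted hunk of select_idle_capacity(). */
static bool toy_preferred_core(bool prefers_idle_core, bool core_idle)
{
	return !prefers_idle_core || core_idle;
}

/* Walk through the three cases discussed above. */
static void toy_check_cases(void)
{
	bool prefers;

	/* !SMT: preferred_core is true whatever the core state is. */
	prefers = toy_prefers_idle_core(false, false);
	assert(toy_preferred_core(prefers, false));
	assert(toy_preferred_core(prefers, true));

	/* SMT with idle cores: preferred_core follows core idleness. */
	prefers = toy_prefers_idle_core(true, true);
	assert(!toy_preferred_core(prefers, false));
	assert(toy_preferred_core(prefers, true));

	/* SMT without idle cores: same result as the !SMT case. */
	prefers = toy_prefers_idle_core(true, false);
	assert(toy_preferred_core(prefers, false));
	assert(toy_preferred_core(prefers, true));
}
```

Calling toy_check_cases() passes silently if the model holds. It says nothing
about which asym_fits_state values are reachable in each case, which is the
part that still needs a proper description.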

>
> >
> > Should this condition not be just:
> >
> > if (!prefers_idle_core && --nr <= 0)
> > return best_cpu
>
> I think this would match select_idle_cpu() more closely. However,
> select_idle_cpu() doesn't have the "best partial idle placement" logic at all,
> it either returns an idle CPU or -1.
>
> I guess it's a policy decision here: do we want to mirror exactly the scan bound
> (nr <= 0 -> hard stop) or allow extra scan based on the ranking quality
> (nr <= 0 -> stop early if satisfied)?

The current proposal is ok for me:
With SMT and an idle core, we loop until finding the best idle core
Without SMT or idle core, we loop until we find a CPU on which the
task utilization matches at least the max capacity
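
For reference, the two stop policies discussed in this thread can be compared
with a small userspace model. This is only a sketch under simplifying
assumptions: the toy_fit ranks are hypothetical and do not match the real
asym_fits_state values (TOY_FIT_GOOD_ENOUGH plays the role of
ASYM_IDLE_CORE_UCLAMP_MISFIT), the walk starts at CPU 0 instead of wrapping
from target, and core idleness is assumed to be folded into the rank already.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks, lower is better; TOY_FIT_BUSY means "not idle". */
enum toy_fit { TOY_FIT_PERFECT, TOY_FIT_GOOD_ENOUGH, TOY_FIT_POOR, TOY_FIT_BUSY };

/* HARD: stop as soon as nr is exhausted. GUARDED: stop only if good enough. */
enum toy_stop { TOY_STOP_HARD, TOY_STOP_GUARDED };

static int toy_scan(const enum toy_fit *fits, int ncpus, int nr,
		    bool prefers_idle_core, enum toy_stop policy)
{
	enum toy_fit best_fits = TOY_FIT_BUSY;
	int best_cpu = -1;

	assert(fits && ncpus >= 0);

	for (int cpu = 0; cpu < ncpus; cpu++) {
		/* nr is only consumed when not preferring an idle core. */
		if (!prefers_idle_core && --nr <= 0 &&
		    (policy == TOY_STOP_HARD ||
		     best_fits == TOY_FIT_GOOD_ENOUGH))
			return best_cpu;

		if (fits[cpu] == TOY_FIT_BUSY)
			continue;

		/* A perfect fit ends the walk immediately. */
		if (fits[cpu] == TOY_FIT_PERFECT)
			return cpu;

		/* Otherwise remember the best partial placement so far. */
		if (fits[cpu] < best_fits) {
			best_fits = fits[cpu];
			best_cpu = cpu;
		}
	}

	return best_cpu;
}
```

With fits = { BUSY, POOR, BUSY, GOOD_ENOUGH, PERFECT } and nr = 3, the hard
stop returns CPU 1 (the poor candidate found before nr ran out), the guarded
stop keeps going and returns CPU 3, and with prefers_idle_core set the walk
ignores nr and ends on CPU 4.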

>
> Thanks,
> -Andrea
>
> >
> > since if we do a:
> >
> > if (!choose_idle_cpu(cpu, p))
> > continue;
> >
> > right after that?
> >
> > best_cpu is -1 by default so sis() will return target, in case we
> > already found a best_cpu then sis() will return this instead.
> >
> > What do I miss here?