Re: [PATCH v2 2/5] sched/fair: Attach sched_domain_shared to sd_asym_cpucapacity
From: Andrea Righi
Date: Tue May 19 2026 - 03:58:22 EST
On Tue, May 19, 2026 at 01:17:20PM +0530, K Prateek Nayak wrote:
> Hello Andrea,
>
> Thank you for taking a look at the diff!
BTW I just re-ran the NVBLAS benchmark on a Vera Rubin machine using
queue:sched/core + this on top, all good!
Thanks,
-Andrea
>
> On 5/19/2026 12:13 PM, Andrea Righi wrote:
> > Hi Prateek,
> >
> > On Tue, May 19, 2026 at 11:22:32AM +0530, K Prateek Nayak wrote:
> >> Hello Peter, Andrea,
> >>
> >> On 5/19/2026 2:28 AM, Peter Zijlstra wrote:
> >>> @@@ -2775,20 -3049,16 +3107,15 @@@ build_sched_domains(const struct cpumas
> >>> if (!sd)
> >>> continue;
> >>>
> >>> + if (has_asym)
> >>> - asym_claimed = claim_asym_sched_domain_shared(&d, i);
> >>> ++ claim_asym_sched_domain_shared(&d, i);
> >>> +
> >>> /* First, find the topmost SD_SHARE_LLC domain */
> >>> while (sd->parent && (sd->parent->flags & SD_SHARE_LLC))
> >>> sd = sd->parent;
> >>>
> >>> if (sd->flags & SD_SHARE_LLC) {
> >>> - /*
> >>> - * Initialize the sd->shared for SD_SHARE_LLC unless
> >>> - * the asym path above already claimed it.
> >>> - */
> >>> - if (!asym_claimed)
> >>> - init_sched_domain_shared(&d, sd);
> >>> - int sd_id = cpumask_first(sched_domain_span(sd));
> >>> -
> >>> - sd->shared = *per_cpu_ptr(d.sds, sd_id);
> >>> - atomic_set(&sd->shared->nr_busy_cpus, sd->span_weight);
> >>> - atomic_inc(&sd->shared->ref);
> >>> ++ init_sched_domain_shared(&d, sd);
> >>
> >> This will run into a small problem with "nr_idle_scan" if
> >> cpumask_first(sched_domain_span(sd)) is the same for both sd_asym and
> >> sd_llc.
> >
> > Ah, good catch! When cpumask_first(asym_span) == cpumask_first(llc_span)
> > (the typical big.LITTLE case), both sd_asym->shared and sd_llc->shared would
> > alias to d->sds[0].
> >
> >>
> >> The load balancer at different domains will populate "nr_idle_scan" with
> >> different values, and they alias to the same ->shared if one isn't
> >> degenerated. I believe there is also at least one way to hit the WARN_ON()
> >> from cpu_attach_domain() if the SD_ASYM_CPUCAPACITY_FULL domain comes
> >> before the last SD_SHARE_LLC domain and the latter is degenerated.
> >>
> >> How about this:
> >>
> >> (On top of queue:sched/core; Lightly tested on !ASYM_CPUCAPACITY system)
> >>
> >> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> >> index fe09d3268bc9..1d2c98dca211 100644
> >> --- a/include/linux/sched/topology.h
> >> +++ b/include/linux/sched/topology.h
> >> @@ -67,7 +67,15 @@ struct sched_domain_shared {
> >> atomic_t ref;
> >> atomic_t nr_busy_cpus;
> >> int has_idle_cores;
> >> - int nr_idle_scan;
> >> + union {
> >> + int nr_idle_scan;
> >> + /*
> >> + * Used during allocation to claim the
> >> + * sched_domain_shared object at
> >> + * multiple levels.
> >
> > I think between build and the first LB tick, readers of nr_idle_scan may observe
> > leftover SD_* flags in nr_idle_scan. This shouldn't be a problem and should
> > self-heal soon, but maybe it's worth a comment? Something like:
> >
> > * Note: between build and the first periodic LB tick, which
> > * rewrites the union via update_idle_cpu_scan(), readers of
> > * nr_idle_scan may observe the transient SD_* flag value as
> > * the scan bound. The flag bits are small positive integers,
> > * so the effect is just a slightly relaxed scan bound for one
> > * window and self-heals on the first tick.
>
> Ack! We start with 0 today, which isn't representative of the system
> state either, and we depend on eventual correctness to fix the value
> after a hotplug / cpuset change.
>
> I can fold in the note and resend it as a formal patch.
>
> Peter, would you prefer a formal patch or would you like to do this
> (or something similar) as a part of the conflict resolution itself?
>
> >> + BUG_ON(!sd->shared);
> >
> > Unreachable in practice, but should we have a WARN_ON_ONCE() +
> > bail/early-return? In this way we'd fall back to using LLC's shared for
> > sd_balance_shared, which seems nicer than a BUG_ON().
>
> Ack! As a backup, we can just use the last CPU's "sds" if we don't end
> up finding a free one. I just had the BUG_ON() to easily spot my VM
> crashing ;-)
>
> --
> Thanks and Regards,
> Prateek
>
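
For reference, the aliasing discussed above can be sketched in a small
userspace model. This is only an illustration under stated assumptions, not
the kernel code: struct shared_model, the sds[] pool, first_cpu() and the
bitmask spans are hypothetical stand-ins for struct sched_domain_shared,
d.sds, cpumask_first() and sched_domain_span(). It shows that picking the
shared object by the first CPU of the span makes two domains share one
"nr_idle_scan" whenever their spans start at the same CPU.

```c
#include <assert.h>

#define NR_CPUS 8

/* Simplified stand-in for struct sched_domain_shared (assumption). */
struct shared_model {
	int nr_idle_scan;
};

/* Hypothetical per-CPU pool, playing the role of d.sds in the thread. */
static struct shared_model sds[NR_CPUS];

/* Lowest set bit of a non-empty span mask; models cpumask_first(). */
static int first_cpu(unsigned int span)
{
	int cpu = 0;

	assert(span != 0);
	while (!(span & 1u)) {
		span >>= 1;
		cpu++;
	}
	return cpu;
}

/* Select the shared object by the first CPU in the span. */
static struct shared_model *shared_by_first_cpu(unsigned int span)
{
	return &sds[first_cpu(span)];
}

/* Returns 1 if both spans resolve to the same shared object. */
static int spans_alias(unsigned int a, unsigned int b)
{
	return shared_by_first_cpu(a) == shared_by_first_cpu(b);
}

/*
 * Write a value through the LLC-level pointer, then a different one
 * through the asym-level pointer, and return what the LLC level reads.
 */
static int llc_value_after_asym_write(unsigned int llc_span,
				      unsigned int asym_span)
{
	struct shared_model *llc = shared_by_first_cpu(llc_span);
	struct shared_model *asym = shared_by_first_cpu(asym_span);

	llc->nr_idle_scan = 4;
	asym->nr_idle_scan = 8;
	return llc->nr_idle_scan;
}
```

With an LLC span of CPUs 0-3 (0x0f) and an asym span of CPUs 0-7 (0xff), both
levels resolve to sds[0], so the second write is visible through the first
pointer. With an LLC span of CPUs 4-7 (0xf0) the two levels stay separate.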