Re: [PATCH] sched/topology: improve topology_span_sane speed

From: Vishal Chourasia
Date: Wed Oct 23 2024 - 09:20:10 EST


On Mon, Oct 21, 2024 at 11:20:58AM -0500, Steve Wahl wrote:
> On Fri, Oct 18, 2024 at 05:05:43PM +0530, Vishal Chourasia wrote:
> > On Thu, Oct 10, 2024 at 10:51:11AM -0500, Steve Wahl wrote:
> > > @@ -2417,9 +2446,6 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
> > >  		sd = NULL;
> > >  		for_each_sd_topology(tl) {
> > >
> > > -			if (WARN_ON(!topology_span_sane(tl, cpu_map, i)))
> > > -				goto error;
> > > -
> > >  			sd = build_sched_domain(tl, cpu_map, attr, sd, i);
> > >
> > >  			has_asym |= sd->flags & SD_ASYM_CPUCAPACITY;
> > > @@ -2433,6 +2459,9 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
> > >  		}
> > >  	}
> > >
> > > +	if (WARN_ON(!topology_span_sane(cpu_map)))
> > > +		goto error;
> > Hi Steve,
>
> Vishal, thank you for taking the time to review.
>
> > Is there any reason why the above check is done after initializing
> > the sched domain structs for all the CPUs in the cpu_map?
>
> The original check was done in the same for_each_sd_topology(tl) loop
> that calls build_sched_domain(). I had trouble 100% convincing myself
> that calls to build_sched_domain() on the previous levels couldn't
> affect calls to tl->mask() in later levels, so I placed the new check
> after all calls to build_sched_domain were complete.
>
Yeah, I don't see build_sched_domain() modifying the cpumask
returned from tl->mask(cpu) either.

> > It looks to me that this check can be performed before the call to
> > __visit_domain_allocation_hell() in build_sched_domains(),
> > resulting in an early return if topology_span_sane() detects an
> > incorrect topology.
>
> This might be OK to do. I would greatly appreciate somebody well
> versed in this code area telling me for certain that it would work.
>
Same here. I would also appreciate confirmation from someone who knows
this code well.
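
To make the suggestion concrete, here is a rough userspace sketch of
"validate first, allocate second". It is not kernel code: spans_sane(),
build_domains(), the unsigned long bitmaps and the allocs counter are
all hypothetical stand-ins, and the equal-or-disjoint rule is only my
reading of what topology_span_sane() enforces for a non-NUMA level.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for one topology level: mask[i] is a bitmap of
 * the CPUs that share this level with CPU i. Not the kernel's cpumask.
 */
bool spans_sane(const unsigned long *mask, int ncpus)
{
	for (int i = 0; i < ncpus; i++) {
		for (int j = i + 1; j < ncpus; j++) {
			/* Two spans must be identical or share no CPU. */
			if (mask[i] != mask[j] && (mask[i] & mask[j]))
				return false;
		}
	}
	return true;
}

/* Validate first, allocate second: a failed check has nothing to free. */
int build_domains(const unsigned long *mask, int ncpus, int *allocs)
{
	unsigned long *storage;

	if (!spans_sane(mask, ncpus))
		return -EINVAL;

	storage = malloc(ncpus * sizeof(*storage));
	if (!storage)
		return -ENOMEM;
	(*allocs)++;

	/* ... domain construction would go here ... */

	free(storage);
	return 0;
}
```

With the check in front, a bad topology returns before any allocation
happens, so the error path has nothing to undo. Whether that ordering
is actually safe in build_sched_domains() is the open question above.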

> > Also, the error path in the current code only cleans up the d->rd
> > struct, keeping all the work done by build_sched_domain() inside the
> > loop and by __alloc_sdt(), called from
> > __visit_domain_allocation_hell().
> >
> > Is it because we need all that work to remain intact?
>
> I'm not seeing this. The return from __visit_domain_allocation_hell()
> is stored in alloc_state and immediately checked to be == sa_rootdomain;
> if not, the error path is taken, deallocating everything and
> returning.
>
> The rest of the function does not touch alloc_state, so any error from
> that point on makes the call to __free_domain_allocs with what ==
> sa_rootdomain, which seems to undo everything.
>
> Are you possibly missing the fallthroughs in __free_domain_allocs()
> even though they're clearly emphasized?
>
Yes, you are right. Thank you for pointing that out.
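
For anyone else reading along, the cascade is easy to reproduce in a
small userspace sketch. This is not the kernel function: free_allocs()
and struct freed are hypothetical, and the frees are replaced by flags
so the effect of each entry point can be inspected. The enum order
follows the case labels in the function quoted below.

```c
#include <assert.h>

/* Allocation stages, latest stage first, mirroring the case labels. */
enum s_alloc { sa_rootdomain, sa_sd, sa_sd_storage, sa_none };

/* Hypothetical record of which teardown steps ran. */
struct freed {
	int rd;
	int sd;
	int storage;
};

struct freed free_allocs(enum s_alloc what)
{
	struct freed f = { 0, 0, 0 };

	assert(what <= sa_none);

	switch (what) {
	case sa_rootdomain:
		f.rd = 1;	/* stands in for free_rootdomain() */
		/* fall through */
	case sa_sd:
		f.sd = 1;	/* stands in for free_percpu() */
		/* fall through */
	case sa_sd_storage:
		f.storage = 1;	/* stands in for __sdt_free() */
		/* fall through */
	case sa_none:
		break;
	}
	return f;
}
```

Entering at sa_rootdomain sets all three flags, entering at
sa_sd_storage sets only the last one, and sa_none sets none, which is
the behaviour I had overlooked.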

> > static void __free_domain_allocs(struct s_data *d, enum s_alloc what,
> > 				 const struct cpumask *cpu_map)
> > {
> > 	switch (what) {
> > 	case sa_rootdomain:
> > 		if (!atomic_read(&d->rd->refcount))
> > 			free_rootdomain(&d->rd->rcu);
> > 		fallthrough;
> > 	case sa_sd:
> > 		free_percpu(d->sd);
> > 		fallthrough;
> > 	case sa_sd_storage:
> > 		__sdt_free(cpu_map);
> > 		fallthrough;
> > 	case sa_none:
> > 		break;
> > 	}
> > }
> >
>
> Thanks,
>
> --> Steve Wahl
>
> --
> Steve Wahl, Hewlett Packard Enterprise