Re: [tip: sched/core] sched/topology: Compute sd_weight considering cpuset partitions
From: K Prateek Nayak
Date: Sat Mar 21 2026 - 05:46:32 EST
Hello folks,
On 3/21/2026 2:29 PM, K Prateek Nayak wrote:
> So I managed to reproduce the crash and it is actually crashing at:
>
> last->next = first;
>
> in build_sched_groups(). If I print the span before and after we do
> the *sd = { ... }, I see:
>
> [ 0.056301] span before: 0
> [ 0.056559] span after:
> [ 0.056686] span double check:
>
> double check does a cpumask_pr_args(sched_domain_span(sd)).
> This solves the crash on top of this patch:
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 79bab80af8f2..b347ae5d2786 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1693,6 +1693,8 @@ sd_init(struct sched_domain_topology_level *tl,
>  		.name			= tl->name,
>  	};
>  
> +	cpumask_and(sd_span, cpu_map, tl->mask(tl, cpu));
> +
>  	WARN_ONCE((sd->flags & (SD_SHARE_CPUCAPACITY | SD_ASYM_CPUCAPACITY)) ==
>  		  (SD_SHARE_CPUCAPACITY | SD_ASYM_CPUCAPACITY),
>  		  "CPU capacity asymmetry not supported on SMT\n");
> ---
>
> And I see:
>
> [ 0.056479] span before: 0
> [ 0.056749] span after: 0
> [ 0.056881] span double check: 0
>
>
> But since span[] is a flexible array member at the end of struct
> sched_domain, doing a *sd = { ... } shouldn't modify it since its size
> isn't known at compile time and the compiler should only overwrite the
> fixed fields.
>
> Is there a compiler angle I'm missing here?
So this is what I've found: By default we have:
cpumask_size: 4
struct sched_domain size: 296
If I do:
diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index a1e1032426dc..f0bebce274f7 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -148,7 +148,7 @@ struct sched_domain {
 	 * by attaching extra space to the end of the structure,
 	 * depending on how many CPUs the kernel has booted up with)
 	 */
-	unsigned long span[];
+	unsigned long span[1];
 };
 
 static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
---
I still see:
cpumask_size: 4
struct sched_domain size: 296
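Since the size is the same with span[] and span[1], the start of span
already lies within sizeof(struct sched_domain), which means we are
overwriting sd->span during the *sd assignment even with the flexible
array member at the end :-(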
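
For anyone who wants to poke at this outside the kernel, here is a minimal
userspace sketch of the same effect. It assumes nothing about the real
sched_domain layout: struct toy_domain and the toy_*() helpers are
hypothetical names made up for illustration. The point is only that a
flexible array member can begin inside the bytes sizeof() already
covers, so a whole-struct assignment may clobber it, and filling the
array after the assignment (the ordering used in the fix above) avoids
relying on what the compiler does with those bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a struct ending in a flexible array member. */
struct toy_domain {
	long weight;
	char level;
	unsigned char span[];	/* starts right after 'level', in the tail padding */
};

/* Bytes of span[] that sizeof(struct toy_domain) already covers. */
static size_t toy_span_overlap(void)
{
	size_t off = offsetof(struct toy_domain, span);

	return sizeof(struct toy_domain) > off ?
	       sizeof(struct toy_domain) - off : 0;
}

/* Allocate the fixed part plus room for span_len bytes of span[]. */
static struct toy_domain *toy_alloc(size_t span_len)
{
	return calloc(1, sizeof(struct toy_domain) + span_len);
}

/*
 * Initialise the fixed fields with a compound literal, then write the
 * span. The assignment leaves the padding bytes (and thus the start of
 * span[]) unspecified, so span[] is only filled in afterwards.
 */
static void toy_init(struct toy_domain *d, long weight,
		     const unsigned char *span, size_t span_len)
{
	assert(d != NULL);
	*d = (struct toy_domain){ .weight = weight };
	memcpy(d->span, span, span_len);
}
```

If toy_span_overlap() returns non-zero, anything stored in span[] before
the assignment in toy_init() cannot be trusted afterwards, which matches
the "span after:" output going empty above.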
--
Thanks and Regards,
Prateek