[PATCH v3] arch_topology: Make cluster topology span at least SMT CPUs

From: Yicong Yang
Date: Mon Sep 05 2022 - 08:35:23 EST


From: Yicong Yang <yangyicong@xxxxxxxxxxxxx>

Currently cpu_clustergroup_mask() returns the mask of the CPU alone if the
cluster spans the same CPUs as cpu_coregroup_mask() or more. This breaks the
topology on non-cluster SMT machines when building with CONFIG_SCHED_CLUSTER=y.

Test with:
qemu-system-aarch64 -enable-kvm -machine virt \
-net none \
-cpu host \
-bios ./QEMU_EFI.fd \
-m 2G \
-smp 48,sockets=2,cores=12,threads=2 \
-kernel $Image \
-initrd $Rootfs \
-nographic \
-append "rdinit=init console=ttyAMA0 sched_verbose loglevel=8"

We'll get the following error:
[ 3.084568] BUG: arch topology borken
[ 3.084570] the SMT domain not a subset of the CLS domain

Since the cluster is a level above SMT, fix this by making the cluster
span at least the SMT CPUs.
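
As an illustration only (not part of the patch), the effect of the one-line
change can be modelled in user space, with plain 64-bit bitmasks standing in
for struct cpumask. Every toy_* name below is hypothetical, and the example
masks are assumptions chosen so that the subset check fires; they are not read
from a real machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not kernel code: one bit per CPU, at most 64 CPUs. */
typedef uint64_t toy_mask;

struct toy_cpu {
	toy_mask thread_sibling;  /* SMT siblings, including the CPU itself */
	toy_mask core_sibling;    /* stands in for cpu_coregroup_mask() */
	toy_mask cluster_sibling; /* stands in for cpu_topology[cpu].cluster_sibling */
};

/* True if every CPU set in a is also set in b. */
bool toy_subset(toy_mask a, toy_mask b)
{
	return (a & ~b) == 0;
}

/* Behaviour before the patch: fall back to the CPU alone. */
toy_mask toy_cluster_old(const struct toy_cpu *t, int cpu)
{
	if (toy_subset(t->core_sibling, t->cluster_sibling))
		return (toy_mask)1 << cpu;
	return t->cluster_sibling;
}

/* Behaviour after the patch: fall back to the SMT siblings. */
toy_mask toy_cluster_new(const struct toy_cpu *t, int cpu)
{
	(void)cpu; /* the fallback no longer depends on the CPU number */
	if (toy_subset(t->core_sibling, t->cluster_sibling))
		return t->thread_sibling;
	return t->cluster_sibling;
}

/*
 * Assumed example: CPU 0 shares a core with CPU 1, and both the core group
 * and the cluster mask cover CPUs 0-23.
 */
struct toy_cpu toy_example_cpu0(void)
{
	struct toy_cpu c = {
		.thread_sibling  = 0x3,
		.core_sibling    = 0xFFFFFF,
		.cluster_sibling = 0xFFFFFF,
	};
	return c;
}

/* The scheduler requires SMT to nest inside CLS; only the new fallback does. */
void toy_selfcheck(void)
{
	struct toy_cpu c = toy_example_cpu0();

	assert(!toy_subset(c.thread_sibling, toy_cluster_old(&c, 0)));
	assert(toy_subset(c.thread_sibling, toy_cluster_new(&c, 0)));
}
```

In this model the old fallback yields a cluster mask containing only CPU 0
while the SMT mask contains CPUs 0 and 1, which is the "SMT domain not a subset
of the CLS domain" condition; the new fallback returns the SMT mask itself, so
the nesting holds by construction.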

Cc: Sudeep Holla <sudeep.holla@xxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Ionela Voinescu <ionela.voinescu@xxxxxxx>
Cc: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Fixes: bfcc4397435d ("arch_topology: Limit span of cpu_clustergroup_mask()")
Signed-off-by: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
---
Changes since v2:
- Use topology_sibling_cpumask() instead of cpu_smt_mask(), which is unavailable
  when CONFIG_SCHED_SMT=n. Sorry for the build regression.
- Drop the Reviewed-by tags from Ionela and Sudeep since the code changed. I hope
  to get them again. Thanks!
Link: https://lore.kernel.org/lkml/20220825092007.8129-1-yangyicong@xxxxxxxxxx/

Changes since v1:
- Mention the kernel config CONFIG_SCHED_CLUSTER=y, per Ionela
Link: https://lore.kernel.org/lkml/20220823073044.58697-1-yangyicong@xxxxxxxxxx/

drivers/base/arch_topology.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 0424b59b695e..7e7e373ffab2 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -724,7 +724,7 @@ const struct cpumask *cpu_clustergroup_mask(int cpu)
 	 */
 	if (cpumask_subset(cpu_coregroup_mask(cpu),
 			   &cpu_topology[cpu].cluster_sibling))
-		return get_cpu_mask(cpu);
+		return topology_sibling_cpumask(cpu);
 
 	return &cpu_topology[cpu].cluster_sibling;
 }
--
2.24.0