[patch V4 02/41] cpu/SMT: Make SMT control more robust against enumeration failures

From: Thomas Gleixner
Date: Mon Aug 14 2023 - 04:54:45 EST


The SMT control mechanism got added as speculation attack vector
mitigation. The implemented logic relies on the primary thread mask to
be set up properly.

This turns out to be an issue with XEN/PV guests because their CPU hotplug
mechanics do not enumerate APICs and therefore the mask is never correctly
populated.

This went unnoticed so far because by chance XEN/PV ends up with
smp_num_siblings == 2. So smt_hotplug_control stays at its default value
CPU_SMT_ENABLED and the primary thread mask is never evaluated in the
context of CPU hotplug.

This stopped "working" with the upcoming overhaul of the topology
evaluation which legitimately provides a fake topology for XEN/PV. That
sets smp_num_siblings to 1, which causes the core CPU hot-plug core to
refuse to bring up the APs.

This happens because smt_hotplug_control is set to CPU_SMT_NOT_SUPPORTED
which causes cpu_smt_allowed() to evaluate the unpopulated primary thread
mask with the conclusion that all non-boot CPUs are not valid to be
plugged.

Make cpu_smt_allowed() more robust and take CPU_SMT_NOT_SUPPORTED and
CPU_SMT_NOT_IMPLEMENTED into account. Rename it to cpu_bootable() while at
it as that makes it more clear what the function is about.

The primary mask issue on x86 XEN/PV needs to be addressed separately as
there are users outside of the CPU hotplug code too.

Fixes: 05736e4ac13c ("cpu/hotplug: Provide knobs to control SMT")
Reported-by: Juergen Gross <jgross@xxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Tested-by: Juergen Gross <jgross@xxxxxxxx>
Tested-by: Sohil Mehta <sohil.mehta@xxxxxxxxx>
Tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>

---
V2: Rename cpu_smt_allowed() - Borislav
---
kernel/cpu.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -625,11 +625,17 @@ static int __init smt_cmdline_disable(ch
}
early_param("nosmt", smt_cmdline_disable);

-static inline bool cpu_smt_allowed(unsigned int cpu)
+static inline bool cpu_bootable(unsigned int cpu)
{
if (cpu_smt_control == CPU_SMT_ENABLED)
return true;

+ if (cpu_smt_control == CPU_SMT_NOT_SUPPORTED)
+ return true;
+
+ if (cpu_smt_control == CPU_SMT_NOT_IMPLEMENTED)
+ return true;
+
if (topology_is_primary_thread(cpu))
return true;

@@ -660,7 +666,7 @@ static inline const struct cpumask *cpuh
return cpu_primary_thread_mask;
}
#else
-static inline bool cpu_smt_allowed(unsigned int cpu) { return true; }
+static inline bool cpu_bootable(unsigned int cpu) { return true; }
static inline bool cpuhp_smt_aware(void) { return false; }
static inline const struct cpumask *cpuhp_get_primary_thread_mask(void)
{
@@ -768,10 +774,10 @@ static int bringup_wait_for_ap_online(un
* SMT soft disabling on X86 requires to bring the CPU out of the
* BIOS 'wait for SIPI' state in order to set the CR4.MCE bit. The
* CPU marked itself as booted_once in notify_cpu_starting() so the
- * cpu_smt_allowed() check will now return false if this is not the
+ * cpu_bootable() check will now return false if this is not the
* primary sibling.
*/
- if (!cpu_smt_allowed(cpu))
+ if (!cpu_bootable(cpu))
return -ECANCELED;
return 0;
}
@@ -1699,7 +1705,7 @@ static int cpu_up(unsigned int cpu, enum
err = -EBUSY;
goto out;
}
- if (!cpu_smt_allowed(cpu)) {
+ if (!cpu_bootable(cpu)) {
err = -EPERM;
goto out;
}