[PATCH v2] x86/tsx: fix KVM guest live migration for tsx=on

From: Jon Kohler
Date: Mon Apr 11 2022 - 16:08:14 EST


Move automatic disablement for TSX microcode deprecation from tsx_init() to
x86_get_tsx_auto_mode(), such that systems with tsx=on will continue to
see the TSX CPU features (HLE, RTM) even on updated microcode.

KVM live migration can be broken on kernels 5.14+ by commit 293649307ef9
("x86/tsx: Clear CPUID bits when TSX always force aborts"). Consider the
following scenario:

1. KVM hosts clustered in a live migration capable setup.
2. KVM guests have TSX CPU features HLE and/or RTM presented.
3. One of the following three maintenance events occurs:
3a. An existing host in the pool running kernel >= 5.14 is updated with
    the new microcode.
3b. A new host running kernel >= 5.14 is commissioned that already has the
    microcode update preloaded.
3c. All hosts are running kernel < 5.14 with the microcode update already
    loaded and one existing host gets updated to kernel >= 5.14.
4. After the maintenance event, the impacted host will not have HLE and RTM
   exposed, and guests with TSX features might fail to live migrate to
   it.
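
To illustrate step 4, here is a minimal sketch of the kind of feature
compatibility check a migration stack might perform. It is not taken from
KVM, qemu or libvirt; can_migrate() and the FEAT_* macros are hypothetical
names, and only the bit positions (HLE is bit 4 and RTM is bit 11 of
CPUID.(EAX=7,ECX=0):EBX) are architectural facts.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX bit positions of the two TSX features. */
#define FEAT_HLE (1u << 4)
#define FEAT_RTM (1u << 11)

/*
 * Hypothetical check: a guest can land on a destination host only if
 * every feature already presented to the guest is offered by that host.
 */
static bool can_migrate(uint32_t guest_features, uint32_t host_features)
{
	return (guest_features & ~host_features) == 0;
}
```

Under this model, a guest started with HLE and RTM cannot move to a host
whose kernel has cleared both bits, while a guest that never saw them is
unaffected.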

Users using tsx=on or CONFIG_X86_INTEL_TSX_MODE_ON should always see
HLE and RTM on capable Intel SKUs, even if microcode has been clubbed to
prevent functionality.

Users using tsx=auto or CONFIG_X86_INTEL_TSX_MODE_AUTO get to roll the
dice with whatever the kernel believes the appropriate default is, which
includes the feature disappearing after a kernel and/or microcode update.
These users should consider masking HLE and RTM at a higher control plane
level, e.g. qemu or libvirt, such that guests on TSX enabled systems do not
see HLE/RTM and therefore do not enable TAA mitigation.
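
The intended behaviour can be modelled outside the kernel roughly as
follows. This is a sketch only: every sketch_* name is hypothetical, it
assumes the TSX_CTRL MSR is supported, and it assumes the auto-mode
fallthrough (not visible in the diff below) enables TSX.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tsx= parameter and the resulting state. */
enum sketch_tsx_param { SKETCH_TSX_ON, SKETCH_TSX_OFF, SKETCH_TSX_AUTO };

enum sketch_tsx_state {
	SKETCH_ENABLE,
	SKETCH_DISABLE,
	SKETCH_RTM_ALWAYS_ABORT,
};

struct sketch_cpu {
	bool rtm_always_abort;	/* models X86_FEATURE_RTM_ALWAYS_ABORT */
	bool tsx_force_abort;	/* models X86_FEATURE_TSX_FORCE_ABORT */
	bool bug_taa;		/* models X86_BUG_TAA */
};

/* Models the patched auto mode: the deprecation check comes first. */
static enum sketch_tsx_state sketch_auto_mode(const struct sketch_cpu *cpu)
{
	if (cpu->rtm_always_abort && cpu->tsx_force_abort)
		return SKETCH_RTM_ALWAYS_ABORT;
	if (cpu->bug_taa)
		return SKETCH_DISABLE;
	/* Assumption: the fallthrough not shown in the diff enables TSX. */
	return SKETCH_ENABLE;
}

/* An explicit on/off wins; only auto consults the CPU state. */
static enum sketch_tsx_state sketch_resolve(enum sketch_tsx_param param,
					    const struct sketch_cpu *cpu)
{
	switch (param) {
	case SKETCH_TSX_ON:
		return SKETCH_ENABLE;
	case SKETCH_TSX_OFF:
		return SKETCH_DISABLE;
	default:
		return sketch_auto_mode(cpu);
	}
}

/* In this model HLE and RTM stay enumerated only in the enabled state. */
static bool sketch_tsx_visible(enum sketch_tsx_state state)
{
	return state == SKETCH_ENABLE;
}
```

In this model, tsx=on keeps HLE and RTM visible on a host with the updated
microcode, while tsx=auto on the same host hides them.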

Fixes: 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force aborts")

Signed-off-by: Jon Kohler <jon@xxxxxxxxxxx>
Cc: Pawan Gupta <pawan.kumar.gupta@xxxxxxxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: Neelima Krishnan <neelima.krishnan@xxxxxxxxx>
Cc: kvm@xxxxxxxxxxxxxxx <kvm@xxxxxxxxxxxxxxx>
---
v1 -> v2:
- Addressed comments on approach from Dave.

arch/x86/kernel/cpu/tsx.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index 9c7a5f049292..4b701fa64869 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -78,6 +78,10 @@ static bool __init tsx_ctrl_is_supported(void)

 static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
 {
+	if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
+	    boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT))
+		return TSX_CTRL_RTM_ALWAYS_ABORT;
+
 	if (boot_cpu_has_bug(X86_BUG_TAA))
 		return TSX_CTRL_DISABLE;

@@ -105,21 +109,6 @@ void __init tsx_init(void)
 	char arg[5] = {};
 	int ret;

-	/*
-	 * Hardware will always abort a TSX transaction if both CPUID bits
-	 * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
-	 * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
-	 * here.
-	 */
-	if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
-	    boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
-		tsx_ctrl_state = TSX_CTRL_RTM_ALWAYS_ABORT;
-		tsx_clear_cpuid();
-		setup_clear_cpu_cap(X86_FEATURE_RTM);
-		setup_clear_cpu_cap(X86_FEATURE_HLE);
-		return;
-	}
-
 	if (!tsx_ctrl_is_supported()) {
 		tsx_ctrl_state = TSX_CTRL_NOT_SUPPORTED;
 		return;
@@ -173,5 +162,16 @@ void __init tsx_init(void)
 		 */
 		setup_force_cpu_cap(X86_FEATURE_RTM);
 		setup_force_cpu_cap(X86_FEATURE_HLE);
+	} else if (tsx_ctrl_state == TSX_CTRL_RTM_ALWAYS_ABORT) {
+
+		/*
+		 * Hardware will always abort a TSX transaction if both CPUID bits
+		 * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
+		 * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
+		 * here.
+		 */
+		tsx_clear_cpuid();
+		setup_clear_cpu_cap(X86_FEATURE_RTM);
+		setup_clear_cpu_cap(X86_FEATURE_HLE);
 	}
 }
--
2.30.1 (Apple Git-130)