Re: [PATCH] KVM: x86: Set BHI_NO in guest when host is not affected by BHI

From: Alexandre Chartre
Date: Mon Apr 15 2024 - 11:15:09 EST



On 4/11/24 15:20, Alexandre Chartre wrote:

On 4/11/24 13:14, Chao Gao wrote:
The problem is that we can end up with a guest running extra BHI mitigations
while this is not needed. Could we inform the guest that eIBRS is not available
on the system so a Linux guest doesn't run with extra BHI mitigations?

Well, that's why Intel specified some MSRs at 0x5000xxxx.

Yes. But note that there is a subtle difference. Those MSRs are used by the guest
to communicate in-use software mitigations to the host. Such information is
stable across migration. Here we need the host to communicate to the guest that
eIBRS isn't available. This isn't stable, as the guest may be migrated from
a host without eIBRS to one with it.


Except I don't know anyone currently interested in implementing them,
and I'm still not sure if they work correctly for some of the more
complicated migration cases.

Looks like you have the same opinion on the Intel-defined virtual MSRs as Sean.
If we all agree that the issue here and the effectiveness problem of the short
BHB-clearing sequence need to be resolved, and we don't think the Intel-defined
virtual MSRs can handle all cases correctly, we have to define a better
interface through community collaboration, as Sean suggested.

Another solution could be to add CPUs to cpu_vuln_whitelist with NO_BHI
(e.g. explicitly add CPUs which don't have eIBRS). That way, the kernel will
figure out the right mitigation on the host and guest.


More precisely, we could do something like this (this is just an example; the
list is obviously incomplete):

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 754d91857d63..80477170ccc0 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1182,6 +1182,24 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
VULNWL_INTEL(ATOM_TREMONT_L, NO_EIBRS_PBRSB),
VULNWL_INTEL(ATOM_TREMONT_D, NO_ITLB_MULTIHIT | NO_EIBRS_PBRSB),
+ /*
+ * The following Intel CPUs are affected by BHI, but they don't have
+ * the eIBRS feature. In that case, the default Spectre v2 mitigations
+ * are enough to also mitigate BHI. We mark these CPUs with NO_BHI so
+ * that X86_BUG_BHI doesn't get set and no extra BHI mitigation is
+ * enabled.
+ *
+ * This prevents guest VMs from enabling extra BHI mitigation when it
+ * is not needed. On the host, X86_BUG_BHI is never set for CPUs which
+ * don't have the eIBRS feature. But this doesn't hold in guest VMs,
+ * as virtualization can hide the eIBRS feature.
+ */
+ VULNWL_INTEL(IVYBRIDGE_X, NO_BHI),
+ VULNWL_INTEL(HASWELL_X, NO_BHI),
+ VULNWL_INTEL(BROADWELL_X, NO_BHI),
+ VULNWL_INTEL(SKYLAKE_X, NO_BHI),
+ VULNWL_INTEL(SKYLAKE_X, NO_BHI),
+
/* AMD Family 0xf - 0x12 */
VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BH
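
To spell out why the whitelist entries matter in a guest, here is a minimal
standalone sketch of the decision as I understand it. It is a simplified
userspace model, not kernel code: the enum values, struct cpu_view,
whitelisted() and bug_bhi_set() are hypothetical names, and the condition
(the bug is assumed when eIBRS is present or when running under a hypervisor,
unless BHI_NO is enumerated or the model is whitelisted) is my assumption
about what the kernel does when it sets X86_BUG_BHI.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model identifiers standing in for the kernel's CPU model IDs. */
enum cpu_model {
	MODEL_IVYBRIDGE_X,
	MODEL_HASWELL_X,
	MODEL_BROADWELL_X,
	MODEL_SKYLAKE_X,
	MODEL_OTHER,
};

/* Stand-in for the NO_BHI whitelist flag used in the patch. */
#define NO_BHI (1u << 0)

struct vuln_entry {
	enum cpu_model model;
	unsigned int flags;
};

/* Mirrors the example (incomplete) list from the patch. */
static const struct vuln_entry vuln_whitelist[] = {
	{ MODEL_IVYBRIDGE_X, NO_BHI },
	{ MODEL_HASWELL_X,   NO_BHI },
	{ MODEL_BROADWELL_X, NO_BHI },
	{ MODEL_SKYLAKE_X,   NO_BHI },
};

/* What the kernel can observe about the CPU it is running on. */
struct cpu_view {
	enum cpu_model model;
	bool arch_cap_bhi_no;	/* BHI_NO enumerated in ARCH_CAPABILITIES */
	bool has_eibrs;		/* eIBRS visible (may be hidden in a guest) */
	bool is_guest;		/* running under a hypervisor */
};

/* Return true if the model is listed with the given flag. */
static bool whitelisted(enum cpu_model model, unsigned int flag)
{
	size_t i;

	for (i = 0; i < sizeof(vuln_whitelist) / sizeof(vuln_whitelist[0]); i++) {
		if (vuln_whitelist[i].model == model &&
		    (vuln_whitelist[i].flags & flag))
			return true;
	}
	return false;
}

/* Assumed decision: would the BHI bug flag be set for this CPU view? */
static bool bug_bhi_set(const struct cpu_view *cpu)
{
	if (cpu->arch_cap_bhi_no)
		return false;
	if (whitelisted(cpu->model, NO_BHI))
		return false;
	/* In a guest eIBRS could be hidden, so assume vulnerable. */
	return cpu->has_eibrs || cpu->is_guest;
}
```

In this model, a Skylake-X guest with eIBRS hidden no longer gets the bug
flag because of the whitelist entry, while an unlisted model seen under a
hypervisor still does.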


alex.