[RFC PATCH v3 00/10] Virtualize Intel IA32_SPEC_CTRL
From: Chao Gao
Date: Wed Apr 10 2024 - 10:35:26 EST
Hi all,
This series is tagged as RFC because I want to seek your feedback on
1. the KVM<->userspace ABI defined in patch 1
I am wondering if we can allow the userspace to configure the mask
and the shadow value during the guest's lifetime and on a per-vCPU basis.
This way, in conjunction with "virtual MSRs" or any other interfaces,
the userspace can adjust hardware mitigations applied to the guest during
the guest's lifetime, e.g., for the best performance.
2. Intel-defined virtual MSRs vs. a new interface
The situation is that some other OSes have already adopted the
Intel-defined virtual MSRs. Given this, I am not sure whether defining a
new interface is still preferable, as it will add more complexity if we
end up with two interfaces for the same purpose.
So, I just want to reconfirm whether the suggestion remains to define a
new interface through community collaboration as suggested at [1].
Below is the cover letter:
Background
==========
Branch History Injection (BHI) is a special form of Spectre variant 2,
where an attacker may manipulate branch history before transitioning
from user to supervisor mode (or from VMX non-root/guest to root mode)
in an effort to cause an indirect branch predictor to select a specific
predictor entry for an indirect branch, and a disclosure gadget at the
predicted target will transiently execute.
To mitigate BHI attacks, the kernel may use the hardware mitigation, i.e.,
BHI_DIS_S, or resort to a SW loop, i.e., the BHB-clearing sequence, when the
hardware mitigation is not supported.
Problem
=======
However, the SW loop is effective on pre-SPR parts but not on SPR and
future parts. This creates a mitigation effectiveness problem for virtual
machines:
Migrating a guest using the SW loop on a pre-SPR part to parts where
the SW loop is ineffective (e.g., a SPR or future part) makes the
guest become vulnerable to BHI.
[For bare metal, this isn't a problem, because parts on which the SW loop
is ineffective always support BHI_DIS_S, which is preferable to the SW
loop.]
Solution
========
This series proposes that QEMU+KVM deploy BHI_DIS_S using "virtualize
IA32_SPEC_CTRL" for the guest if the SW loop is ineffective on the host.
Note that "virtualize IA32_SPEC_CTRL" allows the VMM to prevent the
guest from changing some bits of the IA32_SPEC_CTRL MSR without
intercepting the guest's writes to the MSR.
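
To make the mask/shadow behaviour concrete, here is a rough userspace
model of how a guest write and read might behave with "virtualize
IA32_SPEC_CTRL" enabled. It is only a sketch: struct spec_ctrl_model and
the model_guest_*() helpers are hypothetical names for illustration, not
KVM code, and the write rule (masked bits keep the value in effect, the
other bits take the guest's value, reads return the shadow) is my reading
of the feature rather than a quote from the series.

```c
#include <stdint.h>

#define SPEC_CTRL_IBRS      (1ULL << 0)   /* IA32_SPEC_CTRL bit 0 */
#define SPEC_CTRL_BHI_DIS_S (1ULL << 10)  /* IA32_SPEC_CTRL bit 10 */

/* Hypothetical model of the per-vCPU state; not KVM's real structures. */
struct spec_ctrl_model {
	uint64_t hw_value; /* value in effect while the guest runs */
	uint64_t mask;     /* bits the guest is not allowed to change */
	uint64_t shadow;   /* value the guest believes it wrote */
};

/* Guest WRMSR: masked bits keep their current value, others follow the guest. */
static void model_guest_wrmsr(struct spec_ctrl_model *m, uint64_t val)
{
	m->shadow = val;
	m->hw_value = (m->hw_value & m->mask) | (val & ~m->mask);
}

/* Guest RDMSR: the guest sees what it wrote, not the forced bits. */
static uint64_t model_guest_rdmsr(const struct spec_ctrl_model *m)
{
	return m->shadow;
}
```

With mask = hw_value = SPEC_CTRL_BHI_DIS_S, a guest write of
SPEC_CTRL_IBRS leaves both bits set in hw_value, while a subsequent guest
read returns only SPEC_CTRL_IBRS.
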
This solution leads to a new problem:
Deploying BHI_DIS_S for the guest may cause unnecessary performance loss
if the guest is using other mitigations for BHI or doesn't care about BHI
attacks at all.
To avoid this unnecessary performance loss, we want to allow the guest
to opt out of BHI_DIS_S in this case. The idea is to let the guest report
to KVM/QEMU whether it is using the SW loop. Then KVM/QEMU won't deploy
BHI_DIS_S for the guest if the SW loop isn't in use.
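
The intended policy can be summarised as a small decision function. This
is a sketch of the idea only: should_force_bhi_dis_s() and its parameters
are hypothetical and do not appear in the series, and the case of a guest
that cannot report its mitigation status at all is deliberately left out.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * host_sw_loop_effective: the BHB-clearing sequence works on this host
 *                         (e.g. a pre-SPR part).
 * guest_uses_sw_loop:     the guest reported that it relies on the
 *                         BHB-clearing sequence.
 */
static bool should_force_bhi_dis_s(bool host_sw_loop_effective,
				   bool guest_uses_sw_loop)
{
	/* The guest's own mitigation still works here; nothing to force. */
	if (host_sw_loop_effective)
		return false;

	/*
	 * The SW loop is ineffective on this host. Force BHI_DIS_S only if
	 * the guest actually depends on the loop; otherwise the guest pays
	 * a performance cost for nothing.
	 */
	return guest_uses_sw_loop;
}
```

Only the combination "SW loop ineffective on the host" and "guest uses
the SW loop" results in BHI_DIS_S being forced on.
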
Intel defines a set of para-virtualized MSRs [2] for guests to report
software mitigation status. This series emulates the para-virtualized
MSRs in KVM.
Overall, the series has two parts:
1. patches 1-3: Define the KVM ABI for userspace VMMs (e.g., QEMU) to deploy
hardware mitigations for the guest to solve the mitigation effectiveness
problem when migrating guests across parts with different microarchitectures.
2. patches 4-10: Emulate virtual MSRs so that the guest can report software
mitigation status to avoid the unnecessary performance loss.
[1] https://lore.kernel.org/all/ZH9kwlg2Ac9IER7Y@xxxxxxxxxx/
[2] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html#inpage-nav-4
Chao Gao (4):
KVM: VMX: Cache IA32_SPEC_CTRL_SHADOW field of VMCS
KVM: nVMX: Enable SPEC_CTRL virtualizaton for vmcs02
KVM: VMX: Cache force_spec_ctrl_value/mask for each vCPU
KVM: VMX: Advertise MITI_ENUM_RETPOLINE_S_SUPPORT
Daniel Sneddon (1):
KVM: VMX: Virtualize Intel IA32_SPEC_CTRL
Pawan Gupta (2):
x86/bugs: Use Virtual MSRs to request BHI_DIS_S
x86/bugs: Use Virtual MSRs to request RRSBA_DIS_S
Zhang Chen (3):
KVM: x86: Advertise ARCH_CAP_VIRTUAL_ENUM support
KVM: VMX: Advertise MITIGATION_CTRL support
KVM: VMX: Advertise MITI_CTRL_BHB_CLEAR_SEQ_S_SUPPORT
Documentation/virt/kvm/api.rst | 39 +++++++
arch/x86/include/asm/kvm_host.h | 4 +
arch/x86/include/asm/msr-index.h | 24 +++++
arch/x86/include/asm/vmx.h | 5 +
arch/x86/include/asm/vmxfeatures.h | 2 +
arch/x86/kernel/cpu/bugs.c | 33 ++++++
arch/x86/kernel/cpu/common.c | 1 +
arch/x86/kernel/cpu/cpu.h | 1 +
arch/x86/kvm/svm/svm.c | 3 +
arch/x86/kvm/vmx/capabilities.h | 5 +
arch/x86/kvm/vmx/nested.c | 30 ++++++
arch/x86/kvm/vmx/vmx.c | 162 +++++++++++++++++++++++++++--
arch/x86/kvm/vmx/vmx.h | 21 +++-
arch/x86/kvm/x86.c | 49 ++++++++-
arch/x86/kvm/x86.h | 1 +
include/uapi/linux/kvm.h | 4 +
16 files changed, 376 insertions(+), 8 deletions(-)
base-commit: 2c71fdf02a95b3dd425b42f28fd47fb2b1d22702
--
2.39.3