[PATCH 1/2] KVM: nVMX: fix CR4_READ_SHADOW when L0 updates CR4 during a signal

From: Julian Stecklina
Date: Tue Apr 16 2024 - 08:37:02 EST


From: Thomas Prescher <thomas.prescher@xxxxxxxxxxxxxxxxxxxxx>

This issue occurs when the kernel is interrupted by a signal while
running an L2 guest. If the signal is meant to be delivered to the L0
VMM, and L0 updates CR4 for L1, i.e. when the VMM sets
KVM_SYNC_X86_SREGS in kvm_run->kvm_dirty_regs, the kernel programs an
incorrect read shadow value for L2's CR4.

The result is that the guest can read a value for CR4 where bits from
L1 have leaked into L2.
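
For readers less familiar with the read shadow, here is a minimal
userspace sketch of what a guest observes when it reads CR4 in VMX
non-root mode. It is not kernel code: guest_visible_cr4() and
read_shadow_demo() are hypothetical names, and the register values are
illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define CR4_PGE  (UINT64_C(1) << 7)
#define CR4_VMXE (UINT64_C(1) << 13)

/*
 * Value returned by "mov reg, cr4" in VMX non-root mode: bits set in the
 * CR4 guest/host mask are taken from the read shadow, all other bits are
 * taken from the guest's actual CR4.
 */
static uint64_t guest_visible_cr4(uint64_t guest_cr4, uint64_t mask,
				  uint64_t read_shadow)
{
	return (read_shadow & mask) | (guest_cr4 & ~mask);
}

/* Illustrative values only: VMXE is host-owned, PGE is guest-owned. */
void read_shadow_demo(void)
{
	uint64_t guest_cr4 = CR4_VMXE | CR4_PGE;
	uint64_t mask = CR4_VMXE;

	/* Correct shadow hides VMXE from the guest. */
	assert(guest_visible_cr4(guest_cr4, mask, CR4_PGE) == CR4_PGE);

	/* Wrong shadow (VMXE set) makes VMXE visible to the guest. */
	assert(guest_visible_cr4(guest_cr4, mask, CR4_VMXE | CR4_PGE) ==
	       (CR4_VMXE | CR4_PGE));
}
```

With a stale or wrong shadow, every masked bit in it becomes visible to
the guest, which is the leak described above.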

We found this issue by running uXen [1] as L2 in VirtualBox/KVM [2].
It can also easily be reproduced in Qemu/KVM by forcing an sreg sync
on each call to KVM_RUN [3], or by running Windows 10 as L2. In the
Windows case, CR4.VMXE leaks from L1 to L2, causing the OS to
blue-screen with a kernel thread exception during TLB invalidation,
where the following code sequence triggers the issue:

  mov rax, cr4   <--- L2 reads CR4 with contents from L1
  mov rcx, cr4
  btc rax, 0x7   <--- L2 toggles CR4.PGE
  mov cr4, rax   <--- #GP because L2 writes CR4 with reserved bits set
  mov cr4, rcx
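
The sequence above can be modelled in a few lines of C. This is only a
sketch under stated assumptions: the helper names are hypothetical, and
it assumes VMXE is a reserved CR4 bit from L2's point of view (i.e. L1
does not expose VMX to L2), so any CR4 write with VMXE set faults.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_PGE  (UINT64_C(1) << 7)
#define CR4_VMXE (UINT64_C(1) << 13)

/* Returns 1 if a CR4 write with this value would raise #GP for L2. */
static int l2_cr4_write_faults(uint64_t value, uint64_t l2_reserved_bits)
{
	return (value & l2_reserved_bits) != 0;
}

/*
 * Replays the PGE-toggle TLB flush on the CR4 value L2 observed.
 * Returns 1 if either write back to CR4 would fault.
 */
static int pge_toggle_faults(uint64_t cr4_seen_by_l2,
			     uint64_t l2_reserved_bits)
{
	uint64_t rax = cr4_seen_by_l2;	/* mov rax, cr4 */
	uint64_t rcx = cr4_seen_by_l2;	/* mov rcx, cr4 */

	rax ^= CR4_PGE;			/* btc: complement bit 7 */

	if (l2_cr4_write_faults(rax, l2_reserved_bits))	/* mov cr4, rax */
		return 1;

	return l2_cr4_write_faults(rcx, l2_reserved_bits); /* mov cr4, rcx */
}
```

Calling pge_toggle_faults(CR4_PGE | CR4_VMXE, CR4_VMXE) models the
leaked value and reports a fault, while pge_toggle_faults(CR4_PGE,
CR4_VMXE) models the correct read shadow and does not.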

The existing code seems to fix up CR4_READ_SHADOW after calling
vmx_set_cr4() everywhere except in __set_sregs_common(). While we could
fix it there as well, it's easier to just handle it centrally.
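
As a rough illustration of the central handling, the following
userspace sketch mirrors the decision made in the diff below. The
toy_* names and struct are hypothetical stand-ins, not the kernel's
vmcs12 or nested_read_cr4(), and the values in the comments are
illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR4_PGE  (UINT64_C(1) << 7)
#define CR4_VMXE (UINT64_C(1) << 13)

/* Toy stand-in for the three vmcs12 fields involved. */
struct toy_vmcs12 {
	uint64_t guest_cr4;
	uint64_t cr4_guest_host_mask;
	uint64_t cr4_read_shadow;
};

/* CR4 as L1 intends L2 to see it: masked bits come from L1's shadow. */
static uint64_t toy_nested_read_cr4(const struct toy_vmcs12 *v)
{
	return (v->guest_cr4 & ~v->cr4_guest_host_mask) |
	       (v->cr4_read_shadow & v->cr4_guest_host_mask);
}

/*
 * Shadow value to program: while L2 is active, derive it from L1's
 * VMCS; otherwise use the CR4 value being set, as before.
 */
static uint64_t toy_pick_read_shadow(bool guest_mode,
				     const struct toy_vmcs12 *v,
				     uint64_t cr4)
{
	return guest_mode ? toy_nested_read_cr4(v) : cr4;
}
```

For example, with guest_cr4 = CR4_VMXE | CR4_PGE, a guest/host mask of
CR4_VMXE and a shadow of CR4_PGE, the guest-mode result is CR4_PGE, so
VMXE stays hidden from L2 no matter which caller reaches the function.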

There might be a similar issue with CR0.

[1] https://github.com/OpenXT/uxen
[2] https://github.com/cyberus-technology/virtualbox-kvm
[3] https://github.com/tpressure/qemu/commit/d64c9d5e76f3f3b747bea7653d677bd61e13aafe

Signed-off-by: Julian Stecklina <julian.stecklina@xxxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Thomas Prescher <thomas.prescher@xxxxxxxxxxxxxxxxxxxxx>
---
arch/x86/kvm/vmx/vmx.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6780313914f8..0d4af00245f3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3474,7 +3474,11 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 			hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
 	}
 
-	vmcs_writel(CR4_READ_SHADOW, cr4);
+	if (is_guest_mode(vcpu))
+		vmcs_writel(CR4_READ_SHADOW, nested_read_cr4(get_vmcs12(vcpu)));
+	else
+		vmcs_writel(CR4_READ_SHADOW, cr4);
+
 	vmcs_writel(GUEST_CR4, hw_cr4);
 
 	if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE))
--
2.43.2