[PATCH v4 03/34] KVM: selftests: Add a shameful hack to preserve/clobber GPRs across ucall

From: Sean Christopherson
Date: Fri Jul 28 2023 - 20:37:05 EST


Preserve or clobber all GPRs (except RIP and RSP, as they're saved and
restored via the VMCS) when performing a ucall on x86 to fudge around a
horrific long-standing bug in selftests' nested VMX support where L2's
GPRs are not preserved across a nested VM-Exit. I.e. if a test triggers a
nested VM-Exit to L1 in response to a ucall, e.g. GUEST_SYNC(), then L2's
GPR state can be corrupted.

The issues manifests as an unexpected #GP in clear_bit() when running the
hyperv_evmcs test due to RBX being used to track the ucall object, and RBX
being clobbered by the nested VM-Exit. The problematic hyperv_evmcs
testcase is where L0 (test's host userspace) injects an NMI in response to
GUEST_SYNC(8) from L2, but the bug could "randomly" manifest in any test
that induces a nested VM-Exit from L0. The bug hasn't caused failures in
the past due to sheer dumb luck.

The obvious fix is to rework the nVMX helpers to save/restore L2 GPRs
across VM-Exit and VM-Enter, but that is a much bigger task and carries
its own risks, e.g. nSVM does save/restore GPRs, but not in a thread-safe
manner, and there is a _lot_ of cleanup that can be done to unify code
for doing VM-Enter on nVMX, nSVM, and eVMCS.

Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
.../testing/selftests/kvm/lib/x86_64/ucall.c | 32 +++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/x86_64/ucall.c b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
index 4d41dc63cc9e..a53df3ece2f8 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/ucall.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
@@ -14,8 +14,36 @@ void ucall_arch_init(struct kvm_vm *vm, vm_paddr_t mmio_gpa)

void ucall_arch_do_ucall(vm_vaddr_t uc)
{
- asm volatile("in %[port], %%al"
- : : [port] "d" (UCALL_PIO_PORT), "D" (uc) : "rax", "memory");
+ /*
+ * FIXME: Revert this hack (the entire commit that added it) once nVMX
+ * preserves L2 GPRs across a nested VM-Exit. If a ucall from L2, e.g.
+ * to do a GUEST_SYNC(), lands the vCPU in L1, any and all GPRs can be
+ * clobbered by L1. Save and restore non-volatile GPRs (clobbering RBP
+ * in particular is problematic) along with RDX and RDI (which are
+ * inputs), and clobber volatile GPRs. *sigh*
+ */
+#define HORRIFIC_L2_UCALL_CLOBBER_HACK \
+ "rcx", "rsi", "r8", "r9", "r10", "r11"
+
+ asm volatile("push %%rbp\n\t"
+ "push %%r15\n\t"
+ "push %%r14\n\t"
+ "push %%r13\n\t"
+ "push %%r12\n\t"
+ "push %%rbx\n\t"
+ "push %%rdx\n\t"
+ "push %%rdi\n\t"
+ "in %[port], %%al\n\t"
+ "pop %%rdi\n\t"
+ "pop %%rdx\n\t"
+ "pop %%rbx\n\t"
+ "pop %%r12\n\t"
+ "pop %%r13\n\t"
+ "pop %%r14\n\t"
+ "pop %%r15\n\t"
+ "pop %%rbp\n\t"
+ : : [port] "d" (UCALL_PIO_PORT), "D" (uc) : "rax", "memory",
+ HORRIFIC_L2_UCALL_CLOBBER_HACK);
}

void *ucall_arch_get_ucall(struct kvm_vcpu *vcpu)
--
2.41.0.487.g6d72f3e995-goog