Re: [PATCH] KVM: nVMX: Fix CR4 after VMLAUNCH/VMRESUME failure

From: Jim Mattson
Date: Mon Feb 05 2018 - 13:57:41 EST


[Resending as plain text]

On Mon, Feb 5, 2018 at 10:21 AM Jim Mattson <jmattson@xxxxxxxxxx> wrote:

> This is incorrect. In the event of an early VM-entry failure (e.g. a
> VM-entry failure for "VM entry with invalid control field(s)"), no host
> state should be loaded from the VMCS12. Of course, no guest state should
> have been loaded from the VMCS12 either, but that's a problem we have with
> deferring some VMCS12 control field checks to the hardware.

> CR4 should be unchanged from the time of the VMLAUNCH/VMRESUME. There is
> no guarantee that vmcs12->host_cr4 holds the correct value.
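
To make the point above concrete, here is a minimal sketch of the intended
behaviour, assuming a toy model of the vCPU. The toy_* types and functions
are hypothetical and exist only for illustration; this is not the KVM code.
On an early VM-entry failure, CR4 goes back to the value it had when L1
executed VMLAUNCH/VMRESUME, and vmcs12->host_cr4 is never consulted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR4_PAE   (UINT64_C(1) << 5)   /* CR4.PAE is bit 5 */
#define X86_CR4_UMIP  (UINT64_C(1) << 11)  /* CR4.UMIP is bit 11 */

/* Hypothetical, simplified stand-ins; not the real KVM structures. */
struct toy_vmcs12 { uint64_t host_cr4; };
struct toy_vcpu   { uint64_t cr4; };

/*
 * Toy model of a nested VM entry whose control-field checks are deferred
 * to hardware. Returns true if the entry succeeds, false on early failure.
 */
bool toy_nested_vmentry(struct toy_vcpu *vcpu, const struct toy_vmcs12 *vmcs12,
                        uint64_t guest_cr4, bool controls_valid)
{
	/* L1's CR4 at the time it executed VMLAUNCH/VMRESUME. */
	uint64_t cr4_at_entry = vcpu->cr4;

	/* Guest state gets loaded before the deferred checks run. */
	vcpu->cr4 = guest_cr4;

	if (!controls_valid) {
		/*
		 * Early failure: undo the guest-state load. The host-state
		 * area is deliberately ignored, since vmcs12->host_cr4 may
		 * not match what L1 was actually running with.
		 */
		(void)vmcs12;
		vcpu->cr4 = cr4_at_entry;
		return false;
	}
	return true;
}

/* Small self-check: vmcs12.host_cr4 lacks UMIP, but L1's CR4 keeps it. */
void toy_check_early_failure(void)
{
	struct toy_vcpu vcpu = { .cr4 = X86_CR4_PAE | X86_CR4_UMIP };
	struct toy_vmcs12 vmcs12 = { .host_cr4 = X86_CR4_PAE };

	assert(!toy_nested_vmentry(&vcpu, &vmcs12, X86_CR4_PAE, false));
	assert(vcpu.cr4 == (X86_CR4_PAE | X86_CR4_UMIP));
}
```

In this sketch, assigning vmcs12->host_cr4 on the failure path would clear
UMIP in the toy vCPU, which is the kind of mismatch described above.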


> On Mon, Feb 5, 2018 at 3:05 AM Wanpeng Li <kernellwp@xxxxxxxxx> wrote:

>> From: Wanpeng Li <wanpengli@xxxxxxxxxxx>

>> In L0, Haswell client host:

>> nested_vmx_exit_reflected failed vm entry 7
>> WARNING: CPU: 6 PID: 6797 at kvm/arch/x86/kvm//vmx.c:6206 handle_desc+0x2d/0x40 [kvm_intel]
>> CPU: 6 PID: 6797 Comm: qemu-system-x86 Tainted: G W OE 4.15.0+ #4
>> RIP: 0010:handle_desc+0x2d/0x40 [kvm_intel]
>> Call Trace:
>> vmx_handle_exit+0xbd/0xe20 [kvm_intel]
>> ? kvm_arch_vcpu_ioctl_run+0xcde/0x1c00 [kvm]
>> kvm_arch_vcpu_ioctl_run+0xd5a/0x1c00 [kvm]
>> kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
>> ? kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
>> ? __fget+0xfc/0x210
>> ? __fget+0xfc/0x210
>> do_vfs_ioctl+0xa4/0x6a0
>> ? __fget+0x11d/0x210
>> SyS_ioctl+0x79/0x90
>> entry_SYSCALL_64_fastpath+0x25/0x9c

>> This can be reproduced by running kvm-unit-tests/run_tests.sh vmx_controls
>> in L1. UMIP CPUID bit is exposed to the L1 UMIP aware guest since it is
>> emulated by enabling descriptor-table exits on L0. There is a vmentry fail
>> when L0 tries to run L2 directly, the L1 guest architectural CR4 is not
>> restored after this failure since commit 4f350c6dbcb (kvm: nVMX: Handle
>> deferred early VMLAUNCH/VMRESUME failure properly). The L2 is
>> kvm-unit-tests which will not write CR4 w/ X86_CR4_UMIP bit. After another
>> L1 access descriptor vmexit, we check L2's architectural CR4 instead of
>> L1's architectural CR4. This patch fixes it by restoring L1's architectural
>> CR4 after L0's VMLAUNCH/VMRESUME failure.

>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>> Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
>> ---
>> arch/x86/kvm/vmx.c | 1 +
>> 1 file changed, 1 insertion(+)

>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 23789c9..9fc0492 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -11633,6 +11633,7 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
>> */
>> nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);

>> + vcpu->arch.cr4 = vmcs12->host_cr4;
>> load_vmcs12_mmu_host_state(vcpu, vmcs12);

>> /*
>> --
>> 2.7.4