Re: [PATCH] KVM: nVMX: Fix CR4 after VMLAUNCH/VMRESUME failure

From: Wanpeng Li
Date: Mon Feb 05 2018 - 19:57:59 EST


2018-02-06 2:24 GMT+08:00 Jim Mattson <jmattson@xxxxxxxxxx>:
> [Resending as plain text]
>
> On Mon, Feb 5, 2018 at 10:21 AM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
>
>> This is incorrect. In the event of an early VM-entry failure (e.g. a
>> VM-entry failure for "VM entry with invalid control field(s)"), no host
>> state should be loaded from the VMCS12. Of course, no guest state should
>> have been loaded from the VMCS12 either, but that's a problem we have with
>> deferring some VMCS12 control field checks to the hardware.
>
>> CR4 should be unchanged from the time of the VMLAUNCH/VMRESUME. There is

That is the effective one; what I restore in this patch is the
architectural/guest-visible one.
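
To make the distinction concrete, here is a rough userspace sketch. It is
not kernel code: the toy_* names and the struct are made up for
illustration, and it only assumes that UMIP is emulated by clearing the
bit in the effective CR4 and enabling descriptor-table exiting, and that
the WARNING in handle_desc fires when the architectural CR4 has no UMIP
bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR4_UMIP (1u << 11)	/* CR4 bit 11 */

/* Hypothetical, simplified vCPU state; not the kernel's struct kvm_vcpu. */
struct toy_vcpu {
	uint32_t arch_cr4;	/* architectural (guest-visible) CR4 */
	uint32_t hw_cr4;	/* effective CR4 that hardware would run with */
	bool desc_exiting;	/* descriptor-table exiting enabled? */
};

/*
 * Record the guest's CR4. If the host CPU has no UMIP, hide the bit from
 * the effective value and trap descriptor-table accesses instead.
 */
static void toy_set_cr4(struct toy_vcpu *v, uint32_t cr4, bool host_has_umip)
{
	v->arch_cr4 = cr4;
	v->hw_cr4 = cr4;
	v->desc_exiting = false;

	if (!host_has_umip && (cr4 & X86_CR4_UMIP)) {
		v->hw_cr4 &= ~X86_CR4_UMIP;
		v->desc_exiting = true;
	}
}

/* A descriptor-table exit is only expected if the guest believes UMIP is on. */
static bool toy_desc_exit_is_expected(const struct toy_vcpu *v)
{
	return (v->arch_cr4 & X86_CR4_UMIP) != 0;
}

/* Walk through the failure described in the changelog. */
static void toy_demo(void)
{
	struct toy_vcpu v;

	/* L1 enables UMIP on a host without it (e.g. Haswell). */
	toy_set_cr4(&v, X86_CR4_UMIP, false);
	assert(toy_desc_exit_is_expected(&v));

	/* Failed entry leaves L2's architectural CR4 (no UMIP) behind. */
	v.arch_cr4 = 0;
	assert(v.desc_exiting && !toy_desc_exit_is_expected(&v));

	/* Restoring L1's architectural CR4 makes the exit expected again. */
	v.arch_cr4 = X86_CR4_UMIP;
	assert(toy_desc_exit_is_expected(&v));
}
```

In this model only arch_cr4 changes across the failed entry; hw_cr4 and
desc_exiting stay as L1 left them, which is why the descriptor-table exit
still arrives but no longer matches the architectural value.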

Regards,
Wanpeng Li

>> no guarantee that vmcs12->host_cr4 holds the correct value.
>
>
>> On Mon, Feb 5, 2018 at 3:05 AM Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>
>>> From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
>
>>> In L0, Haswell client host:
>
>>> nested_vmx_exit_reflected failed vm entry 7
>>> WARNING: CPU: 6 PID: 6797 at kvm/arch/x86/kvm//vmx.c:6206 handle_desc+0x2d/0x40 [kvm_intel]
>>> CPU: 6 PID: 6797 Comm: qemu-system-x86 Tainted: G W OE 4.15.0+ #4
>>> RIP: 0010:handle_desc+0x2d/0x40 [kvm_intel]
>>> Call Trace:
>>> vmx_handle_exit+0xbd/0xe20 [kvm_intel]
>>> ? kvm_arch_vcpu_ioctl_run+0xcde/0x1c00 [kvm]
>>> kvm_arch_vcpu_ioctl_run+0xd5a/0x1c00 [kvm]
>>> kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
>>> ? kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
>>> ? __fget+0xfc/0x210
>>> ? __fget+0xfc/0x210
>>> do_vfs_ioctl+0xa4/0x6a0
>>> ? __fget+0x11d/0x210
>>> SyS_ioctl+0x79/0x90
>>> entry_SYSCALL_64_fastpath+0x25/0x9c
>
>>> This can be reproduced by running kvm-unit-tests/run_tests.sh
>>> vmx_controls in L1. UMIP CPUID bit is exposed to the L1 UMIP aware
>>> guest since it is emulated by enabling descriptor-table exits on L0.
>>> There is a vmentry fail when L0 tries to run L2 directly, the L1 guest
>>> architectural CR4 is not restored after this failure since commit
>>> 4f350c6dbcb (kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure
>>> properly). The L2 is kvm-unit-tests which will not write CR4 w/
>>> X86_CR4_UMIP bit. After another L1 access descriptor vmexit, we check
>>> L2's architectural CR4 instead of L1's architectural CR4. This patch
>>> fixes it by restoring L1's architectural CR4 after L0's
>>> VMLAUNCH/VMRESUME failure.
>
>>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>>> Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
>>> ---
>>> arch/x86/kvm/vmx.c | 1 +
>>> 1 file changed, 1 insertion(+)
>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index 23789c9..9fc0492 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -11633,6 +11633,7 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
>>> */
>>> nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
>
>>> + vcpu->arch.cr4 = vmcs12->host_cr4;
>>> load_vmcs12_mmu_host_state(vcpu, vmcs12);
>
>>> /*
>>> --
>>> 2.7.4