Re: [PATCH] KVM: nVMX: Fix CR4 after VMLAUNCH/VMRESUME failure
From: Wanpeng Li
Date: Sun Feb 11 2018 - 06:56:41 EST
2018-02-08 23:29 GMT+08:00 Jim Mattson <jmattson@xxxxxxxxxx>:
> Consider the following scenario:
>
> L1 has never successfully executed VMLAUNCH. It has written 0 to
> vmcs12's host CR3 field using VMWRITE, but the current host CR3 value
A write of 0 to CR3 can't be detected by the hardware checks during VM-entry.
> is actually 3e7000. It has written some illegal control field that the
> L0 KVM doesn't check itself, but defers to the hardware checks on
> vmcs02 instead. So, when L1 tries to execute VMLAUNCH, L0 follows this
> path for "VM-entry to vmcs02 failed due to invalid control field(s)."
> Your change would set CR3 to 0, which is incorrect. CR3 should
> actually be set to 3e7000. Now, if L0 is sane and using EPT, then it
> can find the correct L1 CR3 value in vmcs01's Guest CR3 field, but if
> for some reason L0 is using shadow paging to execute L1, that won't
> work. Similarly, the correct L1 CR4 value should be in vmcs01's CR4
> read shadow field.
>
> You can't just assume that L1 has written values to the vmcs12 host
> fields that actually match the current host values. There is nothing
> in the architecture that would require this behavior.
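
The scenario above can be sketched as a small user-space model. This is only an illustration, not KVM code: the structs and the function toy_restore_l1_after_failed_entry() are hypothetical stand-ins for the vmcs01 and vmcs12 fields named in the quoted text, and the sketch covers only the EPT case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these structs are hypothetical, not KVM's. */
struct toy_vmcs01 {
    uint64_t guest_cr3;       /* L1's own CR3 when L0 uses EPT */
    uint64_t cr4_read_shadow; /* the CR4 value L1 sees when it reads CR4 */
};

struct toy_vmcs12 {
    uint64_t host_cr3; /* whatever L1 last wrote with VMWRITE */
    uint64_t host_cr4; /* only loaded on a real L2->L1 VM-exit */
};

struct toy_l1_state {
    uint64_t cr3;
    uint64_t cr4;
};

/*
 * Early VM-entry failure: there is no L2->L1 VM-exit, so the vmcs12 host
 * fields must not be loaded. With EPT, L1's values are taken from vmcs01.
 * Returns false for shadow paging, where vmcs01's guest CR3 does not hold
 * L1's own CR3; that case is outside this sketch.
 */
static bool toy_restore_l1_after_failed_entry(const struct toy_vmcs01 *vmcs01,
                                              const struct toy_vmcs12 *vmcs12,
                                              bool l0_uses_ept,
                                              struct toy_l1_state *out)
{
    (void)vmcs12; /* deliberately unused: its host fields may be stale */
    if (!l0_uses_ept)
        return false;
    out->cr3 = vmcs01->guest_cr3;
    out->cr4 = vmcs01->cr4_read_shadow;
    return true;
}
```

With vmcs12's host CR3 set to 0 and vmcs01's guest CR3 set to 3e7000, the model leaves L1 with CR3 = 3e7000, which is the outcome described in the scenario.
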
>
> On Wed, Feb 7, 2018 at 10:22 PM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>> 2018-02-08 0:57 GMT+08:00 Jim Mattson <jmattson@xxxxxxxxxx>:
>>> vmcs12->host_cr[34] does not contain the up-to-date values when L1 is
>>> running. L1 can vmwrite any values there. We know at this point that
>>
>> An L1 VMWRITE to vmcs12->host_cr[34] incurs a vmexit so that L0 can
>> emulate it, even if VMCS shadowing is enabled, since host_cr[34] is not
>> shadowed in the bitmap. Why is it not up-to-date when L1 is running?
>>
>> Regards,
>> Wanpeng Li
>>
>>> they are legal (because we checked them), but that's about it. If the
>>> VMLAUNCH/VMRESUME of vmcs12 fails for "invalid control field," there
>>> is no VM-exit from L2 to L1, and these fields are not loaded. Instead,
>>> execution just falls through to the next instruction with VMFailValid
>>> semantics.
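
For reference, the VMFailValid fall-through described above can be modeled roughly as below. This is a user-space sketch, not KVM code: struct toy_vcpu and toy_vmfail_valid() are hypothetical names. The flag bit positions are the architectural RFLAGS ones, and 7 is the VM-instruction error number for "VM entry with invalid control field(s)".

```c
#include <assert.h>
#include <stdint.h>

/* Architectural RFLAGS bit positions. */
#define FLAG_CF (1u << 0)
#define FLAG_PF (1u << 2)
#define FLAG_AF (1u << 4)
#define FLAG_ZF (1u << 6)
#define FLAG_SF (1u << 7)
#define FLAG_OF (1u << 11)

/* VM-instruction error: VM entry with invalid control field(s). */
#define TOY_VMXERR_ENTRY_INVALID_CONTROL_FIELD 7u

/* Hypothetical, heavily reduced vCPU state for L1. */
struct toy_vcpu {
    uint64_t rflags;
    uint64_t cr3;
    uint64_t cr4;
    uint32_t vm_instruction_error;
};

/*
 * VMFailValid: ZF is set, the other arithmetic flags are cleared, and the
 * error number is recorded. No host state is loaded, so CR3 and CR4 are
 * left exactly as they were before the VMLAUNCH/VMRESUME.
 */
static void toy_vmfail_valid(struct toy_vcpu *vcpu, uint32_t error)
{
    vcpu->rflags &= ~(uint64_t)(FLAG_CF | FLAG_PF | FLAG_AF | FLAG_SF | FLAG_OF);
    vcpu->rflags |= FLAG_ZF;
    vcpu->vm_instruction_error = error;
}
```

The point of the model is what it does not touch: CR3 and CR4 keep their pre-VMLAUNCH values, and only RFLAGS and the error field change.
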
>>>
>>> On Wed, Feb 7, 2018 at 12:31 AM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>>>> 2018-02-07 0:58 GMT+08:00 Jim Mattson <jmattson@xxxxxxxxxx>:
>>>>> On Mon, Feb 5, 2018 at 4:57 PM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>>>>>
>>>>>> This is the effective one; what I restore in this patch is
>>>>>> architectural/guest visible.
>>>>>
>>>>> This patch doesn't "restore" the guest visible CR4 to its value at the
>>>>> time of VMLAUNCH/VMRESUME. It loads a new CR4 value from the vmcs12.
>>>>> That behavior is incorrect.
>>>>
>>>> You also pointed this out in
>>>> https://lkml.org/lkml/2018/2/5/518. vmcs12->host_cr3/host_cr4 has the
>>>> up-to-date value when L1 is running, and it is still up-to-date after
>>>> the vmexit caused by L1 executing VMLAUNCH/VMRESUME. I think the value
>>>> stays the same until L0 emulates the VMLAUNCH/VMRESUME. Given the
>>>> comment below, why is vmcs12->host_cr3/cr4 not the value we should
>>>> restore?
>>>>
>>>> * After an early L2 VM-entry failure, we're now back
>>>> * in L1 which thinks it just finished a VMLAUNCH or
>>>> * VMRESUME instruction
>>>>
>>>> Regards,
>>>> Wanpeng Li