Re: [PATCH v2 1/7] KVM: nVMX: Introduce nested_evmcs_is_used()
From: Vitaly Kuznetsov
Date: Mon May 24 2021 - 08:35:34 EST
Maxim Levitsky <mlevitsk@xxxxxxxxxx> writes:
> On Mon, 2021-05-17 at 15:50 +0200, Vitaly Kuznetsov wrote:
>> Unlike regular set_current_vmptr(), nested_vmx_handle_enlightened_vmptrld()
>> can not be called directly from vmx_set_nested_state() as KVM may not have
>> all the information yet (e.g. HV_X64_MSR_VP_ASSIST_PAGE MSR may not be
>> restored yet). Enlightened VMCS is mapped later while getting nested state
>> pages. In the meantime, vmx->nested.hv_evmcs remains NULL and using it
>> for various checks is incorrect. In particular, if KVM_GET_NESTED_STATE is
>> called right after KVM_SET_NESTED_STATE, KVM_STATE_NESTED_EVMCS flag in the
>> resulting state will be unset (and such state will later fail to load).
>>
>> Introduce nested_evmcs_is_used() and use 'is_guest_mode(vcpu) &&
>> vmx->nested.current_vmptr == -1ull' check to detect not-yet-mapped eVMCS
>> after restore.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>> ---
>> arch/x86/kvm/vmx/nested.c | 31 ++++++++++++++++++++++++++-----
>> 1 file changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index 6058a65a6ede..3080e00c8f90 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -141,6 +141,27 @@ static void init_vmcs_shadow_fields(void)
>>  	max_shadow_read_write_fields = j;
>>  }
>>
>> +static inline bool nested_evmcs_is_used(struct vcpu_vmx *vmx)
>> +{
>> +	struct kvm_vcpu *vcpu = &vmx->vcpu;
>> +
>> +	if (vmx->nested.hv_evmcs)
>> +		return true;
>> +
>> +	/*
>> +	 * After KVM_SET_NESTED_STATE, enlightened VMCS is mapped during
>> +	 * KVM_REQ_GET_NESTED_STATE_PAGES handling and until the request is
>> +	 * processed vmx->nested.hv_evmcs is NULL. It is, however, possible to
>> +	 * detect such state by checking 'nested.current_vmptr == -1ull' when
>> +	 * vCPU is in guest mode, it is only possible with eVMCS.
>> +	 */
>> +	if (unlikely(vmx->nested.enlightened_vmcs_enabled && is_guest_mode(vcpu) &&
>> +		     (vmx->nested.current_vmptr == -1ull)))
>> +		return true;
>> +
>> +	return false;
>> +}
>
>
> I think that this is a valid way to solve the issue,
> but it feels like there might be a better way.
> I don't mind though to accept this patch as is.
>
> So here are my 2 cents about this:
>
> First of all after studying how evmcs works I take my words back
> about needing to migrate its contents.
>
> It is indeed enough to migrate its physical address,
> or maybe even just a flag that evmcs is loaded
> (and to my surprise we already do this - KVM_STATE_NESTED_EVMCS)
>
> So how about just having a boolean flag that indicates that evmcs is in use,
> but doesn't imply that we know its address or that it is mapped
> to host address space, something like 'vmx->nested.enlightened_vmcs_loaded'
>
> On migration that flag is saved and restored as the KVM_STATE_NESTED_EVMCS,
> otherwise it is set when we load an evmcs and cleared when it is released.
>
> Then as far as I can see we can use this flag in nested_evmcs_is_used
> since all its callers don't touch evmcs, thus don't need it to be
> mapped.
>
> What do you think?
>
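
To make the comparison concrete, here is a minimal stand-alone sketch (a
user-space model, not kernel code) of the two checks side by side. The
struct is a simplified stand-in for the relevant fields of the nested
state, 'guest_mode' stands in for is_guest_mode(vcpu), and
'enlightened_vmcs_loaded' is the hypothetical flag name from the
suggestion above; none of this is meant as the final implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the fields the two checks look at. */
struct nested_model {
	void *hv_evmcs;               /* NULL until the eVMCS is mapped */
	uint64_t current_vmptr;       /* -1ull when no regular VMCS12 is loaded */
	bool enlightened_vmcs_enabled;
	bool guest_mode;              /* stands in for is_guest_mode(vcpu) */
	bool enlightened_vmcs_loaded; /* hypothetical flag from the suggestion */
};

/* Check as done in this patch: mapped pointer, or the post-restore pattern. */
static bool evmcs_is_used_heuristic(const struct nested_model *n)
{
	if (n->hv_evmcs)
		return true;

	return n->enlightened_vmcs_enabled && n->guest_mode &&
	       n->current_vmptr == UINT64_MAX;
}

/* Check as suggested: rely on an explicit flag, mapped or not. */
static bool evmcs_is_used_flag(const struct nested_model *n)
{
	return n->enlightened_vmcs_loaded;
}
```

In this model both checks agree right after a restore with L2 running
(hv_evmcs == NULL, current_vmptr == -1ull); the flag variant simply does
not depend on the mapping or on current_vmptr at all.
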
First, we need to stay compatible with older KVMs which don't have the
flag, and this is problematic: currently, we always expect vmcs12 to
carry valid contents.

Second, a vCPU can be migrated in three different states:

1) While L2 was running ('true' nested state is in VMCS02)
2) While L1 was running ('true' nested state is in eVMCS)
3) Right after an exit from L2 to L1 was forced
   ('need_vmcs12_to_shadow_sync = true') ('true' nested state is in
   VMCS12).

The current solution is to always use VMCS12 as a container to transfer
the state and, conceptually, it is at least easier to understand.

We can, indeed, transfer eVMCS (or VMCS12) in case 2) through guest
memory, and I even tried that, but it made the code more complex, so
eventually I gave up and decided to preserve the 'always use VMCS12
as a container' status quo.

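
As a rough illustration of the three cases (a stand-alone model, not the
actual KVM code; the enum, struct and function names are made up for
this sketch), the place where the 'true' nested state lives at migration
time can be written as:

```c
#include <stdbool.h>

/* Where the up-to-date nested state lives when the vCPU is saved. */
enum nested_state_source {
	SRC_VMCS02,	/* case 1: L2 was running */
	SRC_EVMCS,	/* case 2: L1 was running */
	SRC_VMCS12,	/* case 3: exit from L2 to L1 was just forced */
};

/* Hypothetical, simplified view of the vCPU state that matters here. */
struct vcpu_model {
	bool guest_mode;			/* L2 is running */
	bool need_vmcs12_to_shadow_sync;	/* VMCS12 not yet synced back */
};

static enum nested_state_source true_state_location(const struct vcpu_model *v)
{
	if (v->guest_mode)
		return SRC_VMCS02;

	if (v->need_vmcs12_to_shadow_sync)
		return SRC_VMCS12;

	return SRC_EVMCS;
}
```

With 'always use VMCS12 as a container', cases 1) and 2) mean the state
has to be brought into VMCS12 before it is handed to userspace, while in
case 3) VMCS12 already holds it.
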
--
Vitaly