Re: [PATCH v5 09/26] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
From: Vitaly Kuznetsov
Date: Mon Aug 22 2022 - 04:48:06 EST
Sean Christopherson <seanjc@xxxxxxxxxx> writes:
> On Fri, Aug 19, 2022, Vitaly Kuznetsov wrote:
>> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>>
>> > On Tue, Aug 02, 2022, Vitaly Kuznetsov wrote:
>> >> +static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
>> >> + enum evmcs_unsupported_ctrl_type ctrl_type)
>> >> +{
>> >> + struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> >> + enum evmcs_revision evmcs_rev = EVMCSv1_2016;
>> >> +
>> >> + if (!hv_vcpu)
>> >
>> > This is a functional change, and I don't think it's correct either. Previously,
>> > KVM would apply the EVMCSv1_2016 filter irrespective of whether or not
>> > vcpu->arch.hyperv is non-NULL. nested_enable_evmcs() doesn't require a Hyper-V
>> > vCPU, and AFAICT nothing requires a Hyper-V vCPU to use eVMCS.
>>
>> Indeed, this *is* correct after PATCH11 when we get rid of VMX feature
>> MSR filtering for KVM-on-Hyper-V as the remaining use for
>> evmcs_get_unsupported_ctls() is Hyper-V on KVM and hv_vcpu is not NULL
>> there.
>
> Hmm, nested_vmx_handle_enlightened_vmptrld() will fail without a Hyper-V vCPU, so
> filtering eVMCS controls iff there's a Hyper-V vCPU makes sense. But that's a
> guest-visible change and should be a separate patch.
>
Yes, the change you suggested:

	if (hv_vcpu &&
	    hv_vcpu->cpuid_cache.nested_eb & HV_X64_NESTED_EVMCS1_2022_UPDATE)
		evmcs_rev = EVMCSv1_2022;

seems to keep the status quo so we can discuss dropping filtering when
!hv_vcpu separately.
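
To spell out why this keeps the status quo, here is a minimal standalone
userspace sketch of the selection logic (not the kernel code). The
sketch_* names, the simplified struct and the bit value are assumptions
made up for illustration; only the shape of the condition comes from the
snippet above.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the two revisions named in the patch; values are arbitrary here. */
enum evmcs_revision { EVMCSv1_2016, EVMCSv1_2022 };

/* Assumed bit position, for illustration only. */
#define SKETCH_NESTED_EVMCS1_2022_UPDATE (1u << 0)

/* Hypothetical stand-in for the cached nested CPUID bits of a Hyper-V vCPU. */
struct sketch_hv_vcpu {
	uint32_t nested_eb;
};

/*
 * A NULL hv_vcpu, or one without the 2022 update bit, falls back to the
 * EVMCSv1_2016 filter, so the old behavior is preserved for both cases.
 */
enum evmcs_revision sketch_pick_evmcs_rev(const struct sketch_hv_vcpu *hv_vcpu)
{
	enum evmcs_revision evmcs_rev = EVMCSv1_2016;

	if (hv_vcpu &&
	    (hv_vcpu->nested_eb & SKETCH_NESTED_EVMCS1_2022_UPDATE))
		evmcs_rev = EVMCSv1_2022;

	return evmcs_rev;
}
```

The only case that changes is a Hyper-V vCPU which advertises the 2022
update; everything else, including !hv_vcpu, is still filtered as 2016.
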
> But that also raises the question of whether or not KVM should honor hyperv_enabled
> when filtering MSRs. Same question for nested VM-Enter. nested_enlightened_vmentry()
> will "fail" without an assist page, and the guest can't set the assist page without
> hyperv_enabled==true, but nothing prevents the host from stuffing the assist page.
The case sounds more like a misbehaving VMM to me. It would probably be
better to fail nested_enlightened_vmentry() immediately on !hyperv_enabled.
>
> And on a very related topic, the handling of kvm_hv_vcpu_init() in kvm_hv_set_cpuid()
> is buggy. KVM will not report an error to userspace for KVM_SET_CPUID2 if allocation
> fails. If a later operation successfully creates a Hyper-V vCPU, KVM will chug along
> with Hyper-V enabled but without having cached the relevant Hyper-V
> CPUID info.
Indeed, that's probably because kvm_vcpu_after_set_cpuid() itself is
never supposed to fail and thus returns 'void'. I'm not up to date on
the discussion of whether small allocations can actually fail (and whether
2832 bytes for 'struct kvm_vcpu_hv' is 'small'), but propagating -ENOMEM
all the way up to the VMM is likely the right way to go.
>
> Less important is that kvm_hv_set_cpuid() should also zero out the CPUID cache if
> Hyper-V is disabled. I'm pretty sure all paths check hyperv_enabled before
> consuming cpuid_cache, but it's unnecessarily risky.
+1
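
For illustration only, the two points above (report the allocation
failure to the caller, and clear the cached CPUID bits when Hyper-V is
disabled) could look roughly like this as a standalone userspace sketch.
All sketch_* names, the single nested_eb cache field and the
sketch_fail_alloc test hook are hypothetical; this is not the kernel's
kvm_hv_set_cpuid().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for 'struct kvm_vcpu_hv'. */
struct sketch_hv_vcpu {
	unsigned int nested_eb;	/* stand-in for cpuid_cache.nested_eb */
};

struct sketch_vcpu {
	bool hyperv_enabled;
	struct sketch_hv_vcpu *hyperv;
};

/* Test hook (an assumption of this sketch): force the allocation to fail. */
bool sketch_fail_alloc;

/* Allocate the Hyper-V context once; report -ENOMEM instead of hiding it. */
int sketch_hv_vcpu_init(struct sketch_vcpu *vcpu)
{
	if (vcpu->hyperv)
		return 0;
	if (!sketch_fail_alloc)
		vcpu->hyperv = calloc(1, sizeof(*vcpu->hyperv));
	return vcpu->hyperv ? 0 : -ENOMEM;
}

int sketch_hv_set_cpuid(struct sketch_vcpu *vcpu, bool hyperv_enabled,
			unsigned int nested_eb)
{
	int ret;

	if (!hyperv_enabled) {
		/* Hyper-V disabled: drop any stale cached CPUID bits. */
		vcpu->hyperv_enabled = false;
		if (vcpu->hyperv)
			vcpu->hyperv->nested_eb = 0;
		return 0;
	}

	ret = sketch_hv_vcpu_init(vcpu);
	if (ret)
		return ret;	/* propagate the failure to the caller */

	vcpu->hyperv->nested_eb = nested_eb;
	/* Set last, so hyperv_enabled implies a non-NULL hyperv pointer. */
	vcpu->hyperv_enabled = true;
	return 0;
}
```

Setting hyperv_enabled only after the allocation succeeded is what gives
the "hyperv_enabled implies non-NULL hyperv" guarantee in this sketch.
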
>
> If we fix the kvm_hv_set_cpuid() allocation failure, then we can also guarantee
> that vcpu->arch.hyperv is non-NULL if vcpu->arch.hyperv_enabled==true. And then
> we can gate the guest eVMCS flow on hyperv_enabled, and evmcs_get_unsupported_ctls()
> can then WARN if hv_vcpu is NULL.
>
Alternatively, we can just KVM_BUG_ON() in kvm_hv_set_cpuid() when
allocation fails, at least for the time being as the VM is likely
useless anyway.
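
A rough userspace sketch of that alternative, assuming the setter keeps
returning 'void': the allocation failure marks the whole VM as unusable
instead of being returned. sketch_bug_on() only loosely imitates the
KVM_BUG_ON() idea (complain, flag the VM, hand back the condition), and
every sketch_* name, as well as passing the allocation result in as a
parameter, is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the VM and vCPU state relevant here. */
struct sketch_vm {
	bool vm_bugged;
};

struct sketch_vcpu {
	struct sketch_vm *vm;
	bool hyperv_enabled;
	void *hyperv;
};

/* Complain once, mark the VM as bugged, and return the condition. */
bool sketch_bug_on(bool cond, struct sketch_vm *vm, const char *what)
{
	if (cond && !vm->vm_bugged) {
		fprintf(stderr, "sketch: VM marked as bugged: %s\n", what);
		vm->vm_bugged = true;
	}
	return cond;
}

/*
 * 'alloc' is the result of the (possibly failed) allocation; passing NULL
 * simulates -ENOMEM without needing fault injection.
 */
void sketch_hv_set_cpuid(struct sketch_vcpu *vcpu, bool hyperv_enabled,
			 void *alloc)
{
	vcpu->hyperv_enabled = false;
	if (!hyperv_enabled)
		return;

	if (!vcpu->hyperv)
		vcpu->hyperv = alloc;

	if (sketch_bug_on(!vcpu->hyperv, vcpu->vm,
			  "Hyper-V vCPU allocation failed"))
		return;

	vcpu->hyperv_enabled = true;
}
```

In this sketch the invariant still holds afterwards: hyperv_enabled is
never true while the hyperv pointer is NULL.
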
> Assuming I'm not overlooking something, I'll fold in yet more patches.
>
Thanks for the thorough review here and don't hesitate to speak up when
you think it's too much of a change to do upon queueing)
--
Vitaly