Re: [PATCH v2 02/49] KVM: x86: Explicitly do runtime CPUID updates "after" initial setup

From: Maxim Levitsky
Date: Thu Jul 04 2024 - 20:51:58 EST


On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> Explicitly perform runtime CPUID adjustments as part of the "after set
> CPUID" flow to guard against bugs where KVM consumes stale vCPU/CPUID
> state during kvm_update_cpuid_runtime(). E.g. see commit 4736d85f0d18
> ("KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT").
>
> Whacking each mole individually is not sustainable or robust, e.g. while
> the aforemention commit fixed KVM's PV features, the same issue lurks for
> Xen and Hyper-V features, Xen and Hyper-V simply don't have any runtime
> features (though spoiler alert, neither should KVM).

>
> Updating runtime features in the "full" path will also simplify adding a
> snapshot of the guest's capabilities, i.e. of caching the intersection of
> guest CPUID and kvm_cpu_caps (modulo a few edge cases).
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/cpuid.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 2b19ff991ceb..e60ffb421e4b 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -345,6 +345,8 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> bitmap_zero(vcpu->arch.governed_features.enabled,
> KVM_MAX_NR_GOVERNED_FEATURES);
>
> + kvm_update_cpuid_runtime(vcpu);
> +
> /*
> * If TDP is enabled, let the guest use GBPAGES if they're supported in
> * hardware. The hardware page walker doesn't let KVM disable GBPAGES,
> @@ -426,8 +428,6 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
> {
> int r;
>
> - __kvm_update_cpuid_runtime(vcpu, e2, nent);
> -
> /*
> * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
> * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
> @@ -440,6 +440,15 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
> * whether the supplied CPUID data is equal to what's already set.
> */
> if (kvm_vcpu_has_run(vcpu)) {
> + /*
> + * Note, runtime CPUID updates may consume other CPUID-driven
> + * vCPU state, e.g. KVM or Xen CPUID bases. Updating runtime
> + * state before full CPUID processing is functionally correct
> + * only because any change in CPUID is disallowed, i.e. using
> + * stale data is ok because KVM will reject the change.
> + */

If I understand correctly the sole reason for the below __kvm_update_cpuid_runtime
is to ensure that kvm_cpuid_check_equal doesn't fail because current cpuid also
was post-processed with runtime updates.

Can we have a comment stating this? Or even better how about moving the
call to __kvm_update_cpuid_runtime into the kvm_cpuid_check_equal,
to emphasize this?


> + __kvm_update_cpuid_runtime(vcpu, e2, nent);
> +
> r = kvm_cpuid_check_equal(vcpu, e2, nent);
> if (r)
> return r;



Overall I am not 100% sure what is better:

Before the patch it was roughly like this:

1. Post process the user given cpuid with bits of KVM runtime state (like xcr0)
At that point the vcpu->arch.cpuid_entries is stale but consistent, it is just old CPUID.

2. kvm_hv_vcpu_init call (IMHO this call can be moved to kvm_vcpu_after_set_cpuid)

3. kvm_check_cpuid on the user provided cpuid

4. Update the vcpu->arch.cpuid_entries with new and post processed cpuid

5. kvm_get_hypervisor_cpuid - I think this also can be cosmetically moved to kvm_vcpu_after_set_cpuid

6. kvm_vcpu_after_set_cpuid itself.


After this change it works like that:

1. kvm_hv_vcpu_init (again this belongs more to kvm_vcpu_after_set_cpuid)
2. kvm_check_cpuid on the user cpuid without post processing - in theory this can cause bugs
3. Update the vcpu->arch.cpuid_entries with new cpuid but without post-processing
4. kvm_get_hypervisor_cpuid
5. kvm_update_cpuid_runtime
6. The old kvm_vcpu_after_set_cpuid

I'm honestly not sure what is better but IMHO moving the kvm_hv_vcpu_init and kvm_get_hypervisor_cpuid into
kvm_vcpu_after_set_cpuid would clean up this mess a bit regardless of this patch.

Best regards,
Maxim Levitsky