Re: [PATCH v2 44/49] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features

From: Maxim Levitsky
Date: Thu Nov 21 2024 - 21:11:47 EST


On Wed, 2024-09-11 at 08:41 -0700, Sean Christopherson wrote:
> On Tue, Sep 10, 2024, Maxim Levitsky wrote:
> > On Mon, 2024-07-08 at 17:24 -0700, Sean Christopherson wrote:
> > > On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > > > On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > > > > - cpuid_entry_change(best, X86_FEATURE_OSPKE,
> > > > > - kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> > > > > + kvm_update_feature_runtime(vcpu, best, X86_FEATURE_OSPKE,
> > > > > + kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE));
> > > > > +
> > > > >
> > > > > best = kvm_find_cpuid_entry_index(vcpu, 0xD, 0);
> > > > > if (best)
> > > >
> > > > I am not 100% sure that we need to do this.
> > > >
> > > > Runtime cpuid changes are a hack that Intel did back then, for various
> > > > reasons. These changes don't really change the feature set that the CPU
> > > > supports, but merely, as you like to say, 'massage' the output of the CPUID
> > > > instruction, usually to keep an unmodified OS happy.
> > > >
> > > > Thus it feels to me that CPU caps should not include the dynamic features,
> > > > and that KVM should not use their values as a source of truth, but
> > > > rather the underlying source of truth (e.g. CR4).
> > > >
> > > > But if you insist, I don't really have a very strong reason to object to this.
> > >
> > > FWIW, I think I agree that CR4 should be the source of truth, but it's largely a
> > > moot point because KVM doesn't actually check OSXSAVE or OSPKE, as KVM never
> > > emulates the relevant instructions. So for those, it's indeed not strictly
> > > necessary.
> > >
> > > Unfortunately, KVM has established ABI for checking X86_FEATURE_MWAIT when
> > > "emulating" MONITOR and MWAIT, i.e. KVM can't use vcpu->arch.ia32_misc_enable_msr
> > > as the source of truth.
> >
> > Can you elaborate on this? Can you give me an example of the ABI?
>
> Writes to MSR_IA32_MISC_ENABLE are guarded with a quirk:
>
> 	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
> 	    ((old_val ^ data) & MSR_IA32_MISC_ENABLE_MWAIT)) {
> 		if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
> 			return 1;
> 		vcpu->arch.ia32_misc_enable_msr = data;
> 		kvm_update_cpuid_runtime(vcpu);
> 	} else {
> 		vcpu->arch.ia32_misc_enable_msr = data;
> 	}
>
> as is enforcement of #UD on MONITOR/MWAIT.
>
> static int kvm_emulate_monitor_mwait(struct kvm_vcpu *vcpu, const char *insn)
> {
> 	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS) &&
> 	    !guest_cpuid_has(vcpu, X86_FEATURE_MWAIT))
> 		return kvm_handle_invalid_op(vcpu);
>
> 	pr_warn_once("%s instruction emulated as NOP!\n", insn);
> 	return kvm_emulate_as_nop(vcpu);
> }
>
> If KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT is enabled but KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS
> is _disabled_, then KVM's ABI is to honor X86_FEATURE_MWAIT regardless of what
> is in vcpu->arch.ia32_misc_enable_msr (because userspace owns X86_FEATURE_MWAIT
> in that scenario).
>
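
For reference, the quirk interaction described above can be modelled outside
the kernel. This is only a rough userspace sketch, not KVM code: struct
toy_vcpu, toy_write_misc_enable() and toy_mwait_faults() are hypothetical
names, the two bool quirk fields stand in for kvm_check_has_quirk() on
KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT and KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS,
and the runtime CPUID update is assumed to simply mirror the MISC_ENABLE
MWAIT bit into the guest's CPUID MWAIT flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the vCPU state discussed in this thread. */
struct toy_vcpu {
	bool quirk_misc_enable_no_mwait; /* models KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT */
	bool quirk_mwait_never_ud;       /* models KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS */
	bool cpuid_mwait;                /* guest-visible X86_FEATURE_MWAIT */
	bool cpuid_xmm3;                 /* guest-visible X86_FEATURE_XMM3 */
	bool misc_enable_mwait;          /* MWAIT bit of MSR_IA32_MISC_ENABLE */
};

/*
 * Models the guarded MSR write: returns 1 if the write is rejected,
 * 0 otherwise.  CPUID is touched only when the quirk is disabled and the
 * MWAIT bit actually changes.
 */
static int toy_write_misc_enable(struct toy_vcpu *v, bool mwait_bit)
{
	if (!v->quirk_misc_enable_no_mwait &&
	    v->misc_enable_mwait != mwait_bit) {
		if (!v->cpuid_xmm3)
			return 1;
		v->misc_enable_mwait = mwait_bit;
		/* Assumed effect of the runtime CPUID update. */
		v->cpuid_mwait = mwait_bit;
	} else {
		/* Quirk enabled (or bit unchanged): CPUID is left alone. */
		v->misc_enable_mwait = mwait_bit;
	}
	return 0;
}

/* Models the #UD check: true if MONITOR/MWAIT would take #UD. */
static bool toy_mwait_faults(const struct toy_vcpu *v)
{
	return !v->quirk_mwait_never_ud && !v->cpuid_mwait;
}
```

With quirk_misc_enable_no_mwait set and quirk_mwait_never_ud clear, clearing
the MSR bit leaves cpuid_mwait as userspace set it, so toy_mwait_faults()
depends only on CPUID and not on the MSR value, which is the scenario where
the MSR cannot serve as the source of truth.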

OK, makes sense.
Best regards,
Maxim Levitsky