Re: [PATCH 1/2] KVM: X86: Move ignore_msrs handling upper the stack

From: Peter Xu
Date: Fri Jun 26 2020 - 15:11:34 EST


On Fri, Jun 26, 2020 at 11:18:20AM -0700, Sean Christopherson wrote:
> On Fri, Jun 26, 2020 at 02:07:32PM -0400, Peter Xu wrote:
> > On Thu, Jun 25, 2020 at 09:25:40AM -0700, Sean Christopherson wrote:
> > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > index 5eb618dbf211..64322446e590 100644
> > > --- a/arch/x86/kvm/cpuid.c
> > > +++ b/arch/x86/kvm/cpuid.c
> > > @@ -1013,9 +1013,9 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> > > *ebx = entry->ebx;
> > > *ecx = entry->ecx;
> > > *edx = entry->edx;
> > > - if (function == 7 && index == 0) {
> > > + if (function == 7 && index == 0 && (*ebx | (F(RTM) | F(HLE))) {
> > > u64 data;
> > > - if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
> > > + if (!kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data) &&
> > > (data & TSX_CTRL_CPUID_CLEAR))
> > > *ebx &= ~(F(RTM) | F(HLE));
> > > }
> > >
> > >
> > > On VMX, MSR_IA32_TSX_CTRL will be added to the so called shared MSR array
> > > regardless of whether or not it is being advertised to userspace (this is
> > > a bug in its own right). Using the host_initiated variant means KVM will
> > > incorrectly bypass VMX's ARCH_CAP_TSX_CTRL_MSR check, i.e. incorrectly
> > > clear the bits if userspace is being weird and stuffed MSR_IA32_TSX_CTRL
> > > without advertising it to the guest.
> >
> > Btw, would it be more straightforward to check "vcpu->arch.arch_capabilities &
> > ARCH_CAP_TSX_CTRL_MSR" rather than "*ebx | (F(RTM) | F(HLE))" even if we want
> > to have such a fix?
>
> Not really. That ends up duplicating the check in vmx_get_msr(). From an
> emulation perspective, this really is a "guest" access to the MSR, in the
> sense that the virtual CPU is in the guest domain, i.e. not a god-like
> entity that gets to break the rules of emulation.

I can't say I agree that it's guest behavior. IMHO KVM plays the role of the
virtual processor. If a bit in a CPUID entry depends on an MSR bit, then
reading that MSR is "processor behavior", which in our case is still host
behavior. It's exactly because we assumed it was guest behavior that we got
confused the first time we saw the "ignored rdmsr" error message, since the
guest had no reason to read that MSR... So even if you want to keep those
error messages, I'd really appreciate it if they could say something different
so that we know it's not a guest rdmsr instruction.

To me, the existing TSX code is not a bug at all (IMHO the evil part is the
tricky knobs and the fact that the handling is buried so deep, which is why I
really want to move this series forward). Instead, I think it's quite elegant
to write things like below...

    if (!__kvm_read_msr(&data) && (data & XXX))
        ...

It's definitely subjective, so I can't argue much... However, it's somewhat
similar to rdmsr_safe and friends in that we don't need to remember two flags
(cap+msr) but only the MSR (and I bet I'm not the only one who likes it, just
look at the huge number of callers of all the "safe" MSR helpers...).
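
To make the "only remember the MSR" point concrete, here is a minimal
userspace sketch of the pattern. It is not kernel code: struct demo_vcpu,
demo_read_msr_safe() and demo_cpuid7_ebx() are hypothetical stand-ins, and the
DEMO_* constants are illustrative only. The assumption is simply that the read
helper fails when the MSR is not exposed, so the caller needs no separate
capability check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants for this sketch, not the kernel's definitions. */
#define DEMO_MSR_TSX_CTRL         0x122u
#define DEMO_TSX_CTRL_CPUID_CLEAR (1ull << 1)
#define DEMO_CPUID7_EBX_HLE       (1u << 4)
#define DEMO_CPUID7_EBX_RTM       (1u << 11)

/* Hypothetical model of the per-vCPU state that matters here. */
struct demo_vcpu {
	bool has_tsx_ctrl;	/* models the capability bit being set */
	uint64_t tsx_ctrl;	/* models the MSR contents */
};

/*
 * "Safe" style read: returns 0 on success and nonzero if the MSR is not
 * available, so the capability check lives in exactly one place.
 */
static int demo_read_msr_safe(const struct demo_vcpu *vcpu, uint32_t msr,
			      uint64_t *data)
{
	assert(data != NULL);

	if (msr != DEMO_MSR_TSX_CTRL || !vcpu->has_tsx_ctrl)
		return 1;

	*data = vcpu->tsx_ctrl;
	return 0;
}

/* Caller only remembers the MSR; a failed read leaves ebx untouched. */
static uint32_t demo_cpuid7_ebx(const struct demo_vcpu *vcpu, uint32_t ebx)
{
	uint64_t data;

	if (!demo_read_msr_safe(vcpu, DEMO_MSR_TSX_CTRL, &data) &&
	    (data & DEMO_TSX_CTRL_CPUID_CLEAR))
		ebx &= ~(DEMO_CPUID7_EBX_RTM | DEMO_CPUID7_EBX_HLE);

	return ebx;
}
```

With this shape, a vCPU without the MSR keeps its RTM/HLE bits no matter what
value happens to be stored, and nothing needs to be logged on the failed read.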

Considering that we still get the unexpected warning message on some hosts
with upgraded firmware, which potentially breaks some realtime systems, would
the simple and clear patch below be acceptable to you?

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 901cd1fdecd9..052c93997965 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1005,7 +1005,8 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 		*ebx = entry->ebx;
 		*ecx = entry->ecx;
 		*edx = entry->edx;
-		if (function == 7 && index == 0) {
+		if (function == 7 && index == 0 &&
+		    vcpu->arch.arch_capabilities & ARCH_CAP_TSX_CTRL_MSR) {
 			u64 data;
 			if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
 			    (data & TSX_CTRL_CPUID_CLEAR))

Then we can further discuss whether and how we'd like to refactor the knobs
and the code around them.

Thanks,

--
Peter Xu