Re: [PATCH 1/2] KVM: X86: Move ignore_msrs handling upper the stack
From: Sean Christopherson
Date: Thu Jun 25 2020 - 12:25:43 EST

On Thu, Jun 25, 2020 at 10:09:13AM +0200, Paolo Bonzini wrote:
> On 25/06/20 08:15, Sean Christopherson wrote:
> > IMO, kvm_cpuid() is simply buggy. If KVM attempts to access a non-existent
> > MSR then it darn well should warn.
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 8a294f9747aa..7ef7283011d6 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -1013,7 +1013,8 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> > *ebx = entry->ebx;
> > *ecx = entry->ecx;
> > *edx = entry->edx;
> > - if (function == 7 && index == 0) {
> > + if (function == 7 && index == 0 && (*ebx & (F(RTM) | F(HLE))) &&
> > + (vcpu->arch.arch_capabilities & ARCH_CAP_TSX_CTRL_MSR)) {
> > u64 data;
> > if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
> > (data & TSX_CTRL_CPUID_CLEAR))
> >
>
> That works too, but I disagree that warning is the correct behavior
> here. It certainly should warn as long as kvm_get_msr blindly returns
> zero. However, for a guest it's fine to access a potentially
> non-existent MSR if you're ready to trap the #GP, and the point of this
> series is to let cpuid.c or any other KVM code do the same.

I get the "what" of the change, and even the "why" to some extent, but I
dislike the idea of supporting/encouraging blind reads/writes to MSRs.
Blind writes are just asking for problems, and suppressing warnings on reads
is almost guaranteed to be suppressing a KVM bug.

Case in point, looking at the TSX thing again, I actually think the fix
should be:

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 5eb618dbf211..64322446e590 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1013,9 +1013,9 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
*ebx = entry->ebx;
*ecx = entry->ecx;
*edx = entry->edx;
- if (function == 7 && index == 0) {
+ if (function == 7 && index == 0 && (*ebx & (F(RTM) | F(HLE)))) {
u64 data;
- if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
+ if (!kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data) &&
(data & TSX_CTRL_CPUID_CLEAR))
*ebx &= ~(F(RTM) | F(HLE));
}

On VMX, MSR_IA32_TSX_CTRL will be added to the so-called shared MSR array
regardless of whether or not it is being advertised to userspace (this is
a bug in its own right). Using the host_initiated variant means KVM will
incorrectly bypass VMX's ARCH_CAP_TSX_CTRL_MSR check, i.e. incorrectly
clear the bits if userspace is being weird and stuffed MSR_IA32_TSX_CTRL
without advertising it to the guest.
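
To make the host_initiated problem concrete, here is a minimal userspace sketch, not kernel code. All names (model_vcpu, model_get_tsx_ctrl, model_cpuid7_ebx) and the MODEL_* bit values are hypothetical stand-ins, and the access check only models the ARCH_CAP_TSX_CTRL_MSR test described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; bit positions are illustrative only. */
#define MODEL_F_HLE                (1u << 4)
#define MODEL_F_RTM                (1u << 11)
#define MODEL_ARCH_CAP_TSX_CTRL    (1ull << 7)
#define MODEL_TSX_CTRL_CPUID_CLEAR (1ull << 1)

struct model_vcpu {
	uint64_t arch_capabilities; /* what is advertised to the guest */
	uint64_t tsx_ctrl;          /* value stuffed by userspace */
};

/* Returns 0 on success, 1 if the access would fault. */
int model_get_tsx_ctrl(const struct model_vcpu *vcpu, bool host_initiated,
		       uint64_t *data)
{
	assert(vcpu && data);

	/* Host-initiated reads skip the "is it advertised?" check. */
	if (!host_initiated &&
	    !(vcpu->arch_capabilities & MODEL_ARCH_CAP_TSX_CTRL))
		return 1;

	*data = vcpu->tsx_ctrl;
	return 0;
}

/* Models the CPUID.7.0 EBX fixup; returns the EBX value the guest sees. */
uint32_t model_cpuid7_ebx(const struct model_vcpu *vcpu, uint32_t ebx,
			  bool host_initiated)
{
	uint64_t data;

	if ((ebx & (MODEL_F_RTM | MODEL_F_HLE)) &&
	    !model_get_tsx_ctrl(vcpu, host_initiated, &data) &&
	    (data & MODEL_TSX_CTRL_CPUID_CLEAR))
		ebx &= ~(MODEL_F_RTM | MODEL_F_HLE);

	return ebx;
}
```

With tsx_ctrl stuffed but the capability bit clear, the host_initiated=true path strips RTM/HLE while the host_initiated=false path leaves them alone, which is the mismatch in question.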

In short, the whole MSR_IA32_TSX_CTRL implementation seems messy and this
is just papering over that mess. The correct fix is to invoke setup_msrs()
on writes to MSR_IA32_ARCH_CAPABILITIES, filtering MSR_IA32_TSX_CTRL out of
shared MSRs when it's not advertised, and change kvm_cpuid() to use the
unprivileged variant.
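
A rough userspace sketch of that shape, purely as an illustration: the sm_* names are hypothetical, the real setup_msrs() handles more MSRs than this, and the only thing modeled is "rebuild the shared list whenever ARCH_CAPABILITIES is written, and include TSX_CTRL only if it is advertised".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; not the kernel's data structures. */
#define SM_MSR_TSX_CTRL      0x122u
#define SM_ARCH_CAP_TSX_CTRL (1ull << 7)
#define SM_MAX_SHARED        4

struct sm_vcpu {
	uint64_t arch_capabilities;
	uint32_t shared_msrs[SM_MAX_SHARED];
	size_t nr_shared;
};

/* Rebuild the shared MSR list from the current capabilities. */
void sm_setup_msrs(struct sm_vcpu *vcpu)
{
	assert(vcpu);

	vcpu->nr_shared = 0;
	if (vcpu->arch_capabilities & SM_ARCH_CAP_TSX_CTRL)
		vcpu->shared_msrs[vcpu->nr_shared++] = SM_MSR_TSX_CTRL;
}

/* Models a write to ARCH_CAPABILITIES: store, then refresh the list. */
void sm_set_arch_capabilities(struct sm_vcpu *vcpu, uint64_t data)
{
	assert(vcpu);

	vcpu->arch_capabilities = data;
	sm_setup_msrs(vcpu);
}

/* True if the given MSR index is currently in the shared list. */
bool sm_is_shared(const struct sm_vcpu *vcpu, uint32_t msr)
{
	assert(vcpu);

	for (size_t i = 0; i < vcpu->nr_shared; i++)
		if (vcpu->shared_msrs[i] == msr)
			return true;
	return false;
}
```

The point of rebuilding on every write is that the list can never go stale relative to what is advertised, so a lookup miss is a reliable "this MSR does not exist for the guest".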

TSX_CTRL aside, if we insist on pointing a gun at our foot at some point,
this should be a dedicated flavor of MSR access, e.g. msr_data.kvm_initiated,
so that it at least requires intentionally loading the gun.
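
One possible reading of such a flavor, as a userspace sketch only. The kvm_initiated field name comes from the suggestion above; everything else (flavor_msr_data, flavor_vcpu, the warnings counter standing in for a logged warning) is a hypothetical assumption and not an existing KVM interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLAVOR_MSR_TSX_CTRL 0x122u

/* Hypothetical request descriptor with an explicit opt-in flag. */
struct flavor_msr_data {
	bool host_initiated; /* access on behalf of userspace */
	bool kvm_initiated;  /* KVM-internal probe that may fail quietly */
	uint32_t index;
	uint64_t data;
};

struct flavor_vcpu {
	bool has_tsx_ctrl;
	uint64_t tsx_ctrl;
	unsigned int warnings; /* stands in for a logged warning */
};

/* Returns 0 on success, 1 if the MSR does not exist for this vCPU. */
int flavor_get_msr(struct flavor_vcpu *vcpu, struct flavor_msr_data *msr)
{
	assert(vcpu && msr);

	if (msr->index == FLAVOR_MSR_TSX_CTRL && vcpu->has_tsx_ctrl) {
		msr->data = vcpu->tsx_ctrl;
		return 0;
	}

	/* Only callers that explicitly opted in get a silent failure. */
	if (!msr->kvm_initiated)
		vcpu->warnings++;
	return 1;
}
```

Every other caller keeps the warning, so a blind read has to be requested on purpose at the call site.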