Re: [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop
From: Sean Christopherson
Date: Mon Apr 01 2024 - 18:29:54 EST

On Mon, Mar 25, 2024, Chao Gao wrote:
> On Fri, Mar 08, 2024 at 05:09:29PM -0800, Sean Christopherson wrote:
> >Unconditionally honor guest PAT on CPUs that support self-snoop, as
> >Intel has confirmed that CPUs that support self-snoop always snoop caches
> >and store buffers. I.e. CPUs with self-snoop maintain cache coherency
> >even in the presence of aliased memtypes, thus there is no need to trust
> >the guest behaves and only honor PAT as a last resort, as KVM does today.
> >
> >Honoring guest PAT is desirable for use cases where the guest has access
> >to non-coherent DMA _without_ bouncing through VFIO, e.g. when a virtual
> >(mediated, for all intents and purposes) GPU is exposed to the guest, along
> >with buffers that are consumed directly by the physical GPU, i.e. which
> >can't be proxied by the host to ensure writes from the guest are performed
> >with the correct memory type for the GPU.
..
> > int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> >diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >index 17a8e4fdf9c4..5dc4c24ae203 100644
> >--- a/arch/x86/kvm/vmx/vmx.c
> >+++ b/arch/x86/kvm/vmx/vmx.c
> >@@ -7605,11 +7605,13 @@ static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
> >
> > /*
> > * Force WB and ignore guest PAT if the VM does NOT have a non-coherent
> >- * device attached. Letting the guest control memory types on Intel
> >- * CPUs may result in unexpected behavior, and so KVM's ABI is to trust
> >- * the guest to behave only as a last resort.
> >+ * device attached and the CPU doesn't support self-snoop. Letting the
> >+ * guest control memory types on Intel CPUs without self-snoop may
> >+ * result in unexpected behavior, and so KVM's (historical) ABI is to
> >+ * trust the guest to behave only as a last resort.
> > */
> >- if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
> >+ if (!static_cpu_has(X86_FEATURE_SELFSNOOP) &&
> >+ !kvm_arch_has_noncoherent_dma(vcpu->kvm))
> > return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
>
> W/ this change, guests w/o pass-thru devices can also access UC memory. Locked
> accesses to UC memory lead to bus locks. So, guests w/o pass-thru devices can
> potentially launch DoS attacks on other CPUs on the host. Isn't that a problem?

Guests can already trigger bus locks with atomic accesses that split cache lines.
And SPR adds bus lock detection. So practically speaking, I'm pretty sure ICX is
the only CPU where anything close to a novel attack is possible. And FWIW, such
an attack is already possible on AMD.
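
For reference, the condition changed by the hunk above can be modeled in isolation. This is only a rough sketch, not the kernel code: the `SKETCH_*` constants and the `force_wb_old()`, `force_wb_new()` and `sketch_mt_mask()` helpers are hypothetical names made up for illustration, the constant values are assumptions, and the "honor guest PAT" path is simplified to WB without the ignore-PAT bit (the real vmx_get_mt_mask() handles MMIO and other cases that are omitted here).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative stand-ins for MTRR_TYPE_WRBACK, VMX_EPT_MT_EPTE_SHIFT and
 * VMX_EPT_IPAT_BIT. The values are assumptions made for this sketch.
 */
#define SKETCH_MT_WB     6u
#define SKETCH_MT_SHIFT  3u
#define SKETCH_IPAT_BIT  (1u << 6)

/* Before the patch: ignore guest PAT unless the VM has non-coherent DMA. */
static bool force_wb_old(bool has_noncoherent_dma)
{
	return !has_noncoherent_dma;
}

/*
 * After the patch: ignore guest PAT only if the CPU lacks self-snoop AND
 * the VM has no non-coherent DMA.
 */
static bool force_wb_new(bool cpu_has_selfsnoop, bool has_noncoherent_dma)
{
	return !cpu_has_selfsnoop && !has_noncoherent_dma;
}

/*
 * Simplified EPT memtype bits: WB in both cases, plus the ignore-PAT bit
 * when the guest's PAT is not to be honored.
 */
static uint8_t sketch_mt_mask(bool force_wb)
{
	uint8_t mask = (uint8_t)(SKETCH_MT_WB << SKETCH_MT_SHIFT);

	if (force_wb)
		mask |= (uint8_t)SKETCH_IPAT_BIT;
	return mask;
}
```

In this model, the only input combination whose result differs between the two versions is a CPU with self-snoop running a VM without non-coherent DMA, which is the case the question above is about: the old check forces WB and ignores guest PAT, the new check honors guest PAT.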