Re: [PATCH v2 04/23] KVM: x86: Inhibit AVIC SPTEs if any vCPU enables x2APIC

From: Sean Christopherson
Date: Tue Sep 20 2022 - 12:51:02 EST


On Tue, Sep 20, 2022, Sean Christopherson wrote:
> On Tue, Sep 20, 2022, Maxim Levitsky wrote:
> > On Sat, 2022-09-03 at 00:22 +0000, Sean Christopherson wrote:
> > > Reintroduce APICV_INHIBIT_REASON_X2APIC as a "partial" inhibit for AMD
> > > to fix a bug where the APIC access page is visible to vCPUs that have
> > > x2APIC enabled, i.e. shouldn't be able to "see" the xAPIC MMIO region.
> > >
> > > On AMD, due to its "hybrid" mode where AVIC is enabled when x2APIC is
> > > enabled even without x2AVIC support, the bug occurs any time AVIC is
> > > enabled as x2APIC is fully emulated by KVM.  I.e. hardware isn't aware
> > > that the guest is operating in x2APIC mode.
> > >
> > > Opportunistically drop the "can" while updating avic_activate_vmcb()'s
> > > comment, i.e. to state that KVM _does_ support the hybrid mode.  Move
> > > the "Note:" down a line to conform to preferred kernel/KVM multi-line
> > > comment style.
> > >
> > > Leave Intel as-is for now to avoid a subtle performance regression, even
> > > though Intel likely suffers from the same bug.  On Intel, in theory the
> > > bug rears its head only when vCPUs share host page tables (extremely
> > > likely) and x2APIC enabling is not consistent within the guest, i.e. if
> > > some vCPUs have x2APIC enabled and other does do not (unlikely to occur
> > > except in certain situations, e.g. bringing up APs).
> >
> > Are you sure about this?
>
> Ah, no. The key on Intel is the separate VMCS control for virtualizing xAPIC
> accesses. As you note below, KVM will provide memory semantics, which is technically
> wrong.
>
> > This is what I am thinking will happen, I might be wrong but I am not sure:
>
> ...
>
> > 3. guest accesses the 0xfee00xxx, assuming APICv/x2avic, the hardware won't redirect
> > the access to apic backing page, but will instead just use that SPTE and let the guest
> > read/write the private page that we mapped in the range, which is wrong.
> >
> > Am I missing something?
>
> No, I don't believe so. I'm still hesitant to add the effetive inhibit to Intel,
> though that's probably just pure paranoia at this point. Probably makes sense to
> just do it and verify that x2APIC virtualization still works.

Scratch that, Intel can end up with memory semantics irrespective of x2APIC, e.g.
if APICv is enabled but a vCPU disables its APIC. We could force a bus error by
inhibiting APICv if any vCPU has its APIC hardware disabled, but IMO that's a bad
tradeoff as there are legitimate reasons to disable the APIC on select vCPUs.

So, I'll omit Intel from the x2APIC pseudo-inhibit, and maybe add a KVM erratum
to document that KVM may provide memory semantics for APIC_BASE when APICv/AVIC
is enabled.