Re: [PATCH v2 04/23] KVM: x86: Inhibit AVIC SPTEs if any vCPU enables x2APIC

From: Sean Christopherson
Date: Tue Sep 20 2022 - 11:47:09 EST


On Tue, Sep 20, 2022, Maxim Levitsky wrote:
> On Sat, 2022-09-03 at 00:22 +0000, Sean Christopherson wrote:
> > Reintroduce APICV_INHIBIT_REASON_X2APIC as a "partial" inhibit for AMD
> > to fix a bug where the APIC access page is visible to vCPUs that have
> > x2APIC enabled, i.e. shouldn't be able to "see" the xAPIC MMIO region.
> >
> > On AMD, due to its "hybrid" mode where AVIC is enabled when x2APIC is
> > enabled even without x2AVIC support, the bug occurs any time AVIC is
> > enabled as x2APIC is fully emulated by KVM.  I.e. hardware isn't aware
> > that the guest is operating in x2APIC mode.
> >
> > Opportunistically drop the "can" while updating avic_activate_vmcb()'s
> > comment, i.e. to state that KVM _does_ support the hybrid mode.  Move
> > the "Note:" down a line to conform to preferred kernel/KVM multi-line
> > comment style.
> >
> > Leave Intel as-is for now to avoid a subtle performance regression, even
> > though Intel likely suffers from the same bug.  On Intel, in theory the
> > bug rears its head only when vCPUs share host page tables (extremely
> > likely) and x2APIC enabling is not consistent within the guest, i.e. if
> > some vCPUs have x2APIC enabled and other does do not (unlikely to occur
> > except in certain situations, e.g. bringing up APs).
>
> Are you sure about this?

Ah, no. The key on Intel is the separate VMCS control for virtualizing xAPIC
accesses. As you note below, KVM will provide memory semantics, which is technically
wrong.

> This is what I am thinking will happen, I might be wrong but I am not sure:

...

> 3. guest accesses the 0xfee00xxx, assuming APICv/x2avic, the hardware won't redirect
> the access to apic backing page, but will instead just use that SPTE and let the guest
> read/write the private page that we mapped in the range, which is wrong.
>
> Am I missing something?

No, I don't believe so. I'm still hesitant to add the effetive inhibit to Intel,
though that's probably just pure paranoia at this point. Probably makes sense to
just do it and verify that x2APIC virtualization still works.

> Also I somewhat doen't like the partial inhibit - it is to some extent
> misleading. I don't have a very strong option on using the inhibit, but its
> meaning just feels a bit overloaded.
>
> So why not to do it this way:
>
> 1. zap the SPTE always when switching apic mode (e.g move the code in
> __kvm_set_or_clear_apicv_inhibit to a common funtion)
>
> 2. make kvm_faultin_pfn check a flag 'any vcpu enabled x2apic' and refuse
> to re-install that spte?
>
> Something like that (only compile tested, and likely misses memory barriers):

Actually, since this is "sticky", we can go even further and just delete the
memslot. Deleting the memslot is slightly complicated by the need to drop SRCU
if kvm_lapic_set_base() enables x2APIC during KVM_RUN, but that's enough enough
to handled by putting the disabling logic in vcpu_run() and using KVM_REQ_UNBLOCK
to ensure the memslot is deleted before the vCPU re-enters the guest.

Testing now...