Re: [PATCH v2 04/23] KVM: x86: Inhibit AVIC SPTEs if any vCPU enables x2APIC

From: Maxim Levitsky
Date: Fri Sep 23 2022 - 06:20:06 EST


On Tue, 2022-09-20 at 15:46 +0000, Sean Christopherson wrote:
> On Tue, Sep 20, 2022, Maxim Levitsky wrote:
> > On Sat, 2022-09-03 at 00:22 +0000, Sean Christopherson wrote:
> > > Reintroduce APICV_INHIBIT_REASON_X2APIC as a "partial" inhibit for AMD
> > > to fix a bug where the APIC access page is visible to vCPUs that have
> > > x2APIC enabled, i.e. shouldn't be able to "see" the xAPIC MMIO region.
> > >
> > > On AMD, due to its "hybrid" mode where AVIC is enabled when x2APIC is
> > > enabled even without x2AVIC support, the bug occurs any time AVIC is
> > > enabled as x2APIC is fully emulated by KVM. I.e. hardware isn't aware
> > > that the guest is operating in x2APIC mode.
> > >
> > > Opportunistically drop the "can" while updating avic_activate_vmcb()'s
> > > comment, i.e. to state that KVM _does_ support the hybrid mode. Move
> > > the "Note:" down a line to conform to preferred kernel/KVM multi-line
> > > comment style.
> > >
> > > Leave Intel as-is for now to avoid a subtle performance regression, even
> > > though Intel likely suffers from the same bug. On Intel, in theory the
> > > bug rears its head only when vCPUs share host page tables (extremely
> > > likely) and x2APIC enabling is not consistent within the guest, i.e. if
> > > some vCPUs have x2APIC enabled and other does do not (unlikely to occur
> > > except in certain situations, e.g. bringing up APs).
> >
> > Are you sure about this?
>
> Ah, no. The key on Intel is the separate VMCS control for virtualizing xAPIC
> accesses. As you note below, KVM will provide memory semantics, which is technically
> wrong.
>
> > This is what I am thinking will happen, I might be wrong but I am not sure:
>
> ...
>
> > 3. guest accesses the 0xfee00xxx, assuming APICv/x2avic, the hardware won't redirect
> > the access to apic backing page, but will instead just use that SPTE and let the guest
> > read/write the private page that we mapped in the range, which is wrong.
> >
> > Am I missing something?
>
> No, I don't believe so. I'm still hesitant to add the effetive inhibit to Intel,
> though that's probably just pure paranoia at this point. Probably makes sense to
> just do it and verify that x2APIC virtualization still works.
>
> > Also I somewhat doen't like the partial inhibit - it is to some extent
> > misleading. I don't have a very strong option on using the inhibit, but its
> > meaning just feels a bit overloaded.
> >
> > So why not to do it this way:
> >
> > 1. zap the SPTE always when switching apic mode (e.g move the code in
> > __kvm_set_or_clear_apicv_inhibit to a common funtion)
> >
> > 2. make kvm_faultin_pfn check a flag 'any vcpu enabled x2apic' and refuse
> > to re-install that spte?
> >
> > Something like that (only compile tested, and likely misses memory barriers):
>
> Actually, since this is "sticky", we can go even further and just delete the
> memslot. Deleting the memslot is slightly complicated by the need to drop SRCU
> if kvm_lapic_set_base() enables x2APIC during KVM_RUN, but that's enough enough
> to handled by putting the disabling logic in vcpu_run() and using KVM_REQ_UNBLOCK
> to ensure the memslot is deleted before the vCPU re-enters the guest.

Yes, that is the elephant in the room - deleting the memslot makes all of the sense,
and I thought about doing it, except that it has a chance of letting
the genie out of its bottle again - remember that mess we had with the fact that
the memslots are rcu protected?

If it works, I 100% support the idea.

Also I think you want to remove the KVM_REQ_UNBLOCK, in the other patch series
you just posted?

Best regards,
Maxim Levitsky

>
> Testing now...
>