Re: [PATCH 1/2] KVM: x86/mmu: Use vCPU's APICv status when handling APIC_ACCESS memslot

From: Maxim Levitsky
Date: Sun Oct 10 2021 - 08:47:16 EST


On Fri, 2021-10-08 at 18:01 -0700, Sean Christopherson wrote:
> Query the vCPU's APICv status, not the overall VM's status, when handling
> a page fault that hit the APIC Access Page memslot. If an APICv status
> update is pending, using the VM's status is non-deterministic as the
> initiating vCPU may or may not have updated overall VM's status. E.g. if
> a vCPU hits an APIC Access page fault with APICv disabled and a different
> vCPU is simultaneously performing an APICv update, the page fault handler
> will incorrectly skip the special APIC access page MMIO handling.
>
> Using the vCPU's status in the page fault handler is correct regardless
> of any pending APICv updates, as the vCPU's status is accurate with
> respect to the last VM-Enter, and thus reflects the context in which the
> page fault occurred.

Actually, I don't think this patch is correct; I believe the current code is correct.

- The page fault can happen if one of the following is true:

- AVIC is currently inhibited.

- AVIC is currently inhibited but is in the process of being uninhibited.

- AVIC is not inhibited but has never been accessed by a vCPU after it was uninhibited.

This will *usually* cause this code to populate the corresponding SPTE and re-enter the guest, which
will make the AVIC work on instruction re-execution without a page fault.

Whether it does depends on whether the page fault code sees the new or the old value of the global inhibition
state. That race is not possible to avoid, as the page fault can happen at any time.

If the code doesn't populate the SPTE, the access will be emulated (which is correct too), and the next access
will page fault again, and that fault will re-install the SPTE.
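
As a rough illustration of the flow above, here is a small userspace toy model, not kernel code. All names
(toy_vm, toy_apic_access_fault, and so on) are made up for this sketch, and it only models the outcome of a
fault given the inhibit value the handler happened to observe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the two outcomes of an APIC access page fault. */
enum toy_fault_result { TOY_PF_EMULATE, TOY_PF_RETRY };

struct toy_vm {
	bool apicv_inhibited;   /* VM-global inhibit state */
	bool apic_spte_present; /* the SPTE is VM-global as well */
};

/*
 * observed_inhibited is the value of the global inhibit state that the
 * fault handler happened to read; it may be stale during an update.
 */
static enum toy_fault_result toy_apic_access_fault(struct toy_vm *vm,
						   bool observed_inhibited)
{
	if (observed_inhibited)
		return TOY_PF_EMULATE; /* SPTE stays unpopulated */

	vm->apic_spte_present = true;  /* populate and retry the instruction */
	return TOY_PF_RETRY;
}

/* A fault that sees the old value emulates; the next fault installs the SPTE. */
static void toy_stale_read_selftest(void)
{
	struct toy_vm vm = { .apicv_inhibited = false, .apic_spte_present = false };

	/* The handler raced with the uninhibit and saw the old value. */
	assert(toy_apic_access_fault(&vm, true) == TOY_PF_EMULATE);
	assert(!vm.apic_spte_present);

	/* The next access faults again and now sees the new value. */
	assert(toy_apic_access_fault(&vm, vm.apicv_inhibited) == TOY_PF_RETRY);
	assert(vm.apic_spte_present);
}
```

Either outcome is fine in this model: the stale read only costs one emulated access and one extra fault.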


Note that AVIC's SPTE is *VM global*, just like all other SPTEs.

- The decision here is whether to populate the SPTE and retry, or just emulate the APIC read/write without populating it.

Since AVIC reads/writes the same APIC register page, emulating the access now, or populating the SPTE,
enabling AVIC and letting the AVIC perform the access, should read/write the same values.

Thus the real decision here is whether to populate the SPTE or not.

- If AVIC is currently inhibited on this vCPU, but the global AVIC inhibit is already OFF, we do want
to populate the SPTE, and prior to guest entry we will update the vCPU inhibit state to disable inhibition on this vCPU.

So it's the global AVIC inhibit state that is correct to use for this decision, IMHO.
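
To make the last case concrete, here is a minimal userspace sketch (again not kernel code) that compares the two
checks in the window where the global inhibit is already off but this vCPU has not yet refreshed its own state.
The sketch_* names are hypothetical; they only stand in for the VM-wide check used by the current code and the
per-vCPU check proposed by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the VM-global and per-vCPU state. */
struct sketch_vm {
	bool apicv_inhibited;          /* VM-global inhibit state */
};

struct sketch_vcpu {
	struct sketch_vm *vm;
	bool apicv_active;             /* per-vCPU state as of the last VM-Enter */
};

/* Models the decision made by the current code (VM-global state). */
static bool sketch_populate_by_vm_state(const struct sketch_vcpu *vcpu)
{
	return !vcpu->vm->apicv_inhibited;
}

/* Models the decision proposed by the patch (per-vCPU state). */
static bool sketch_populate_by_vcpu_state(const struct sketch_vcpu *vcpu)
{
	return vcpu->apicv_active;
}

static void sketch_transition_selftest(void)
{
	/* Global inhibit already lifted, vCPU state not yet refreshed. */
	struct sketch_vm vm = { .apicv_inhibited = false };
	struct sketch_vcpu vcpu = { .vm = &vm, .apicv_active = false };

	/* The global check populates the SPTE; the per-vCPU check emulates. */
	assert(sketch_populate_by_vm_state(&vcpu));
	assert(!sketch_populate_by_vcpu_state(&vcpu));

	/* Prior to guest entry the vCPU picks up the global state. */
	vcpu.apicv_active = !vm.apicv_inhibited;
	assert(sketch_populate_by_vcpu_state(&vcpu));
}
```

In this window only the global check installs the SPTE that the vCPU will need once it re-enters the guest with AVIC enabled.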

Best regards,
Maxim Levitsky


>
> Cc: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> Fixes: 9cc13d60ba6b ("KVM: x86/mmu: allow APICv memslot to be enabled but invisible")
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/mmu/mmu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 24a9f4c3f5e7..d36e205b90a5 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3853,7 +3853,7 @@ static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
> * when the AVIC is re-enabled.
> */
> if (slot && slot->id == APIC_ACCESS_PAGE_PRIVATE_MEMSLOT &&
> - !kvm_apicv_activated(vcpu->kvm)) {
> + !kvm_vcpu_apicv_active(vcpu)) {
> *r = RET_PF_EMULATE;
> return true;
> }