Re: [PATCH 04/28] KVM: x86/mmu: Retry page fault if root is invalidated by memslot update
From: Ben Gardon
Date: Mon Nov 22 2021 - 14:54:57 EST
On Fri, Nov 19, 2021 at 8:51 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> Bail from the page fault handler if the root shadow page was obsoleted by
> a memslot update. Do the check _after_ acuiring mmu_lock, as the TDP MMU
*acquiring
> doesn't rely on the memslot/MMU generation, and instead relies on the
> root being explicit marked invalid by kvm_mmu_zap_all_fast(), which takes
> mmu_lock for write.
>
> For the TDP MMU, inserting a SPTE into an obsolete root can leak a SP if
> kvm_tdp_mmu_zap_invalidated_roots() has already zapped the SP, i.e. has
> moved past the gfn associated with the SP.
>
> For other MMUs, the resulting behavior is far more convoluted, though
> unlikely to be truly problematic. Installing SPs/SPTEs into the obsolete
> root isn't directly problematic, as the obsolete root will be unloaded
> and dropped before the vCPU re-enters the guest. But because the legacy
> MMU tracks shadow pages by their role, any SP created by the fault can
> can be reused in the new post-reload root. Again, that _shouldn't_ be
> problematic as any leaf child SPTEs will be created for the current/valid
> memslot generation, and kvm_mmu_get_page() will not reuse child SPs from
> the old generation as they will be flagged as obsolete. But, given that
> continuing with the fault is pointess (the root will be unloaded), apply
> the check to all MMUs.
>
> Fixes: b7cccd397f31 ("KVM: x86/mmu: Fast invalidation for TDP MMU")
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Ben Gardon <bgardon@xxxxxxxxxx>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
A couple spelling nits, but otherwise:
Reviewed-by: Ben Gardon <bgardon@xxxxxxxxxx>
> ---
> arch/x86/kvm/mmu/mmu.c | 23 +++++++++++++++++++++--
> arch/x86/kvm/mmu/paging_tmpl.h | 3 ++-
> 2 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index d2ad12a4d66e..31ce913efe37 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1936,7 +1936,11 @@ static void mmu_audit_disable(void) { }
>
> static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> {
> - return sp->role.invalid ||
> + if (sp->role.invalid)
> + return true;
> +
> + /* TDP MMU pages due not use the MMU generation. */
*do
> + return !sp->tdp_mmu_page &&
> unlikely(sp->mmu_valid_gen != kvm->arch.mmu_valid_gen);
> }
>
> @@ -3976,6 +3980,20 @@ static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
> return true;
> }
>
> +/*
> + * Returns true if the page fault is stale and needs to be retried, i.e. if the
> + * root was invalidated by a memslot update or a relevant mmu_notifier fired.
> + */
> +static bool is_page_fault_stale(struct kvm_vcpu *vcpu,
> + struct kvm_page_fault *fault, int mmu_seq)
> +{
> + if (is_obsolete_sp(vcpu->kvm, to_shadow_page(vcpu->arch.mmu->root_hpa)))
> + return true;
> +
> + return fault->slot &&
> + mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, fault->hva);
> +}
> +
> static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> {
> bool is_tdp_mmu_fault = is_tdp_mmu(vcpu->arch.mmu);
> @@ -4013,8 +4031,9 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> else
> write_lock(&vcpu->kvm->mmu_lock);
>
> - if (fault->slot && mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, fault->hva))
> + if (is_page_fault_stale(vcpu, fault, mmu_seq))
> goto out_unlock;
> +
> r = make_mmu_pages_available(vcpu);
> if (r)
> goto out_unlock;
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index f87d36898c44..708a5d297fe1 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -911,7 +911,8 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
>
> r = RET_PF_RETRY;
> write_lock(&vcpu->kvm->mmu_lock);
> - if (fault->slot && mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, fault->hva))
> +
> + if (is_page_fault_stale(vcpu, fault, mmu_seq))
> goto out_unlock;
>
> kvm_mmu_audit(vcpu, AUDIT_PRE_PAGE_FAULT);
> --
> 2.34.0.rc2.393.gf8c9666880-goog
>