Re: [PATCH v3 04/28] KVM: x86/mmu: Formalize TDP MMU's (unintended?) deferred TLB flush logic

From: Sean Christopherson
Date: Thu Mar 03 2022 - 11:45:19 EST


On Thu, Mar 03, 2022, Mingwei Zhang wrote:
> On Thu, Mar 03, 2022, Sean Christopherson wrote:
> > Heh, if SVM's ASID management wasn't a mess[*], it'd be totally fine. The idea,
> > and what EPT architectures mandates, is that each TDP root is associated with an
> > ASID. So even though there may be stale entries in the TLB for a root, because
> > that root is no longer used those stale entries are unreachable. And if KVM ever
> > happens to reallocate the same physical page for a root, that's ok because KVM must
> > be paranoid and flush that root (see code comment in this patch).
> >
> > What we're missing on SVM is proper ASID handling. If KVM uses ASIDs the way AMD
> > intends them to be used, then this works as intended because each root is again
> > associated with a specific ASID, and KVM just needs to flush when (re)allocating
> > a root and when reusing an ASID (which it already handles).
> >
> > [*] https://lore.kernel.org/all/Yh%2FJdHphCLOm4evG@xxxxxxxxxx
>
> Oh, putting AMD issues aside for now.
>
> I think I might be too narrow down to the zapping logic previously. So,
> I originally think anytime we want to zap, we have to do the following
> things in strict order:
>
> 1) zap SPTEs.
> 2) flush TLBs.
> 3) flush cache (AMD SEV only).
> 4) deallocate shadow pages.

Not necessarily. 1-3 are actually all optional. E.g. for #1, if KVM somehow
knew that the host didn't care about A/D bits (no writeback needed, no LRU info
needed), then KVM could skip straight to freeing the shadow pages when destroying
a VM.

Flushing the TLB before freeing pages is optional because KVM only needs to ensure
the guest can no longer access the memory. E.g. at kvm_mmu_notifier_release(),
because KVM disallows KVM_RUN from a different mm, KVM knows that the guest will
never run again and so can skip the TLB flushes.

For the TLB, that does mean KVM needs to flush when using an ASID/EPT4 for the
first time, but KVM needs to do that regardless to guard against a different
hypervisor being loaded previously (where a "different" hypervisor could very
well be an older, buggier version of KVM).

> However, if you have already invalidated EPTP (pgd ptr), then step 2)
> becomes optional, since those stale TLBs are no longer useable by the
> guest due to the change of ASID.

Mostly. It doesn't require an "invalidated EPTP", just a different EPT4A (or
ASID on SVM).

> Am I understanding the point correctly? So, for all invalidated roots,
> the assumption is that we have already called "kvm_reload_rmote_mmus()",
> which basically update the ASID.

No, the assumption (though I'd describe it a requirement) is that vCPUs can no
longer consume the TLB entries. That could be due to a reload, but as above it
could also be due to KVM knowing KVM_RUN is unreachable.