Re: [PATCH v3 01/37] KVM: VMX: Flush all EPTP/VPID contexts on remote TLB flush

From: Lai Jiangshan
Date: Tue Aug 03 2021 - 23:12:44 EST

On Tue, Aug 3, 2021 at 11:39 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> On Tue, Aug 03, 2021, Lai Jiangshan wrote:
> > (I'm replying to a very old email, so many CCs are dropped.)
> >
> > On Sat, Mar 21, 2020 at 5:33 AM Sean Christopherson
> > <sean.j.christopherson@xxxxxxxxx> wrote:
> > >
> > > Flush all EPTP/VPID contexts if a TLB flush _may_ have been triggered by
> > > a remote or deferred TLB flush, i.e. by KVM_REQ_TLB_FLUSH. Remote TLB
> > > flushes require all contexts to be invalidated, not just the active
> > > contexts, e.g. all mappings in all contexts for a given HVA need to be
> > > invalidated on a mmu_notifier invalidation. Similarly, the instigator
> > > of the deferred TLB flush may be expecting all contexts to be flushed,
> > > e.g. vmx_vcpu_load_vmcs().
> > >
> > > Without nested VMX, flushing only the current EPTP/VPID context isn't
> > > problematic because KVM uses a constant VPID for each vCPU, and
> >
> > Hello, Sean
> >
> > Is the patch optimized for cases where nested VMX is active?
> Well, this patch isn't, but KVM has since been optimized to do full EPT/VPID
> flushes only when "necessary". Necessary in quotes because the two uses can
> technically be further optimized, but doing so would incur significant complexity.

Hello, thanks for your reply.

I know there might be a lot of possible optimizations to be considered, many of
which are too complicated to be implemented.

The optimization I considered yesterday is "ept_sync_global() V.S.
ept_sync_context(this_vcpu's)" in the case: when the VM is using EPT and
doesn't allow nested VMs. (And I failed to express it yesterday)

In this case, the vCPU uses only one single root_hpa, and I think ept sync
for single context is enough for both cases you listed below.

When the context is flushed, the TLB for the vCPU is clean to run.

If kvm changes the mmu->root_hpa, it is kvm's responsibility to request
another flush which is implemented.

In other words, KVM_REQ_TLB_FLUSH == KVM_REQ_TLB_FLUSH_CURRENT in this case.
And before this patch, kvm flush only the single context rather than global.

> Use #1 is remote flushes from the MMU, which don't strictly require a global flush,
> but KVM would need to propagate more information (mmu_role?) in order for responding
> vCPUs to determine what contexts needs to be flushed. And practically speaking,
> for MMU flushes there's no meaningful difference when using TDP without nested
> guests as the common case will be that each vCPU has a single active EPTP and
> that EPTP will be affected by the MMU changes, i.e. needs to be flushed.

I don't see when we need "to determine what contexts" since the vcpu is
using only one context in this case which is the assumption in my mind,
could you please correct me if I'm wrong.


> Use #2 is in VMX's pCPU migration path. Again, not strictly necessary as KVM could
> theoretically track which pCPUs have run a particular vCPU and when that pCPU last
> flushed EPT contexts, but fully solving the problem would be quite complex. Since
> pCPU migration is always going to be a slow path, the extra complexity would be
> very difficult to justify.
> > I think the non-nested cases are normal cases.
> >
> > Although the related code has been changed, the logic of the patch
> > is still working now, would it be better if we restore the optimization
> > for the normal cases (non-nested)?
> As above, vmx_flush_tlb_all() hasn't changed, but the callers have.