Re: [PATCH v2 31/32] KVM: nVMX: Don't flush TLB on nested VM transition with EPT enabled

From: Sean Christopherson
Date: Wed Mar 18 2020 - 13:02:50 EST


On Wed, Mar 18, 2020 at 11:36:04AM +0100, Paolo Bonzini wrote:
> On 17/03/20 19:22, Sean Christopherson wrote:
> > On Tue, Mar 17, 2020 at 06:18:37PM +0100, Paolo Bonzini wrote:
> >> On 17/03/20 05:52, Sean Christopherson wrote:
> >>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> >>> index d816f1366943..a77eab5b0e8a 100644
> >>> --- a/arch/x86/kvm/vmx/nested.c
> >>> +++ b/arch/x86/kvm/vmx/nested.c
> >>> @@ -1123,7 +1123,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
> >>>  	}
> >>>
> >>>  	if (!nested_ept)
> >>> -		kvm_mmu_new_cr3(vcpu, cr3, false);
> >>> +		kvm_mmu_new_cr3(vcpu, cr3, enable_ept);
> >>
> >> Even if enable_ept == false, we could have already scheduled or flushed
> >> the TLB soon due to one of 1) nested_vmx_transition_tlb_flush 2)
> >> vpid_sync_context in prepare_vmcs02 3) the processor doing it for
> >> !enable_vpid.
> >>
> >> So for !enable_ept only KVM_REQ_MMU_SYNC is needed, not
> >> KVM_REQ_TLB_FLUSH_CURRENT I think. Worth adding a TODO?
> >
> > Now that you point it out, I think it makes sense to unconditionally pass
> > %true here, i.e. rely 100% on nested_vmx_transition_tlb_flush() to do the
> > right thing.
>
> Why doesn't it need KVM_REQ_MMU_SYNC either?

Hmm, so if L1 is using VPID, we're ok without a sync. Junaid's INVVPID
patch earlier in this series ensures cached roots won't retain unsync'd
SPTEs when L1 does INVVPID. If L1 doesn't flush L2's TLB on VM-Entry, it
can't expect L2 to recognize changes in the PTEs since the last INVVPID.

Per Intel's SDM, INVLPG (and INVPCID) are only required to invalidate
entries for the current VPID, i.e. virtual VPID=0 when executed by L1.

  Operations that architecturally invalidate entries in the TLBs or
  paging-structure caches independent of VMX operation (e.g., the INVLPG and
  INVPCID instructions) invalidate linear mappings and combined mappings.
  They are required to do so only for the current VPID.
                             ^^^^^^^^^^^^^^^^^^^^^^^^^

If L1 isn't using VPID and L0 isn't using EPT, then a sync is required as
L1 would expect PTE changes to be recognized without an explicit INVLPG
prior to VM-Enter.

So something like this?

	if (!nested_ept)
		kvm_mmu_new_cr3(vcpu, cr3, enable_ept ||
					   nested_cpu_has_vpid(vmcs12));

The KVM_REQ_TLB_FLUSH_CURRENT request would be redundant with
nested_vmx_transition_tlb_flush() when VPID is enabled, and is a (big) nop
when VPID is disabled. In either case the overhead is negligible. Ideally
this logic would tie into nested_vmx_transition_tlb_flush() in some way,
but making that happen may be wishful thinking.
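
To spell out the four cases, here is a standalone userspace sketch of the
condition above. It is not kernel code: nested_can_skip_sync() is a made-up
name for illustration, and its two bool parameters stand in for enable_ept
and nested_cpu_has_vpid(vmcs12). The per-case comments reflect my reading of
the reasoning above, not a verified statement of KVM's behavior.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not a KVM function.  Models the third argument
 * proposed for kvm_mmu_new_cr3() on the !nested_ept path: returns true
 * when the explicit MMU sync / TLB flush request can be skipped.
 */
static bool nested_can_skip_sync(bool enable_ept, bool l1_uses_vpid)
{
	/*
	 * enable_ept:   assumed to mean L0 runs L2 on hardware EPT, so
	 *               there are no shadow PTEs for L2 to fall out of sync.
	 * l1_uses_vpid: L1 must INVVPID before it can expect L2 to see PTE
	 *               changes, and the INVVPID emulation handles the
	 *               unsync'd SPTEs in cached roots.
	 * Neither:      L1 expects PTE changes to be visible at VM-Enter
	 *               without any explicit invalidation, so a sync is
	 *               required.
	 */
	return enable_ept || l1_uses_vpid;
}
```

Only the (!enable_ept, L1 not using VPID) combination returns false, i.e.
that is the one case where the sync still has to happen.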

> All this should be in a comment as well, of course.

Heh, in hindsight that's painfully obvious.