Re: [PATCH] KVM: x86: use again the flush argument of __link_shadow_page()

From: Sean Christopherson

Date: Mon May 04 2026 - 14:36:53 EST


On Mon, May 04, 2026, Sean Christopherson wrote:
> x86/mmu
>
> On Sun, May 03, 2026, Paolo Bonzini wrote:
> > Except in the case of parentless nested-TDP pages, mmu_page_zap_pte()
> > clears the SPTE but leaves the invalid_list empty. In this case, using
> > kvm_flush_remote_tlbs() as kvm_mmu_remote_flush_or_zap() does is overkill.
> > Avoid flushing the entirety of the remote TLBs unless the invalid_list
> > was populated: instead, use a more efficient gfn-targeting flush (if
> > available) and skip it altogether if the caller guarantees that a TLB
> > flush is not necessary.
> >
> > Based-on: <20260503201029.106481-1-pbonzini@xxxxxxxxxx>
> > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 892246204435..85bec8eeace8 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -2541,8 +2541,10 @@ static void __link_shadow_page(struct kvm *kvm,
> > parent_sp = sptep_to_sp(sptep);
> > WARN_ON_ONCE(parent_sp->role.level == PG_LEVEL_4K);
> >
> > - mmu_page_zap_pte(kvm, parent_sp, sptep, &invalid_list);
> > - kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, true);
> > + if (mmu_page_zap_pte(kvm, parent_sp, sptep, &invalid_list))
> > + kvm_mmu_commit_zap_page(kvm, &invalid_list);
> > + else if (flush)
> > + kvm_flush_remote_tlbs_sptep(kvm, sptep);
>
> Duh, this is obvious in hindsight.
>
> Reviewed-by: Sean Christopherson <seanjc@xxxxxxxxxx>

An amendment to that: I thought this was just switching back to the more targeted
range-based flush; I didn't realize you had applied the version that hardcodes the
@flush param to kvm_mmu_remote_flush_or_zap() as "true".

If you take this through kvm.git directly, can you add this comment?

/*
 * Note! @flush from the caller doesn't follow KVM's standard
 * "collect TLB flushes in a variable to batch them" pattern.
 * In this case, @flush is used to communicate whether or not a
 * TLB flush is needed *now*, and specifically only impacts the
 * case where a huge SPTE is replaced with a shadow page SPTE
 * (KVM always flushes if a shadow page SPTE is zapped).
 *
 * When splitting a hugepage and the new shadow page is fully
 * populated, i.e. every child SPTE is shadow-present and thus
 * the total mappings are functionally identical, KVM can defer
 * the TLB flush (until the ioctl completes) as no memory has
 * been unmapped, and all mappings are still reachable, e.g. so
 * that future mmu_notifier invalidations are guaranteed to
 * flush the affected range if relevant mappings are zapped.
 */

If you're expecting me to grab this, I'll add the comment when applying.
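
For reference, the three-way decision in the quoted diff can be modeled in plain
userspace C. This is only a sketch: the enum, its labels, and
link_flush_action() are hypothetical names used for illustration, not kernel
identifiers, and the first parameter assumes that a non-zero return from
mmu_page_zap_pte() means pages were queued on the invalid_list.

```c
#include <stdbool.h>

/* Hypothetical labels for the three outcomes; not kernel identifiers. */
enum flush_action {
	FLUSH_NONE,		/* caller guarantees no flush is needed now */
	FLUSH_GFN_RANGE,	/* cf. kvm_flush_remote_tlbs_sptep() */
	FLUSH_COMMIT_ZAP,	/* cf. kvm_mmu_commit_zap_page() */
};

/*
 * Model of the branch structure in the quoted hunk.
 *
 * @zap_queued_pages: assumed to mirror a non-zero return from
 *                    mmu_page_zap_pte(), i.e. invalid_list was populated.
 * @flush:            the caller's "a TLB flush is needed *now*" argument.
 */
static enum flush_action link_flush_action(bool zap_queued_pages, bool flush)
{
	if (zap_queued_pages)
		return FLUSH_COMMIT_ZAP;
	if (flush)
		return FLUSH_GFN_RANGE;
	return FLUSH_NONE;
}
```

Note that in this model @flush only matters when nothing was queued on the
invalid_list; a populated list takes the commit path regardless of what the
caller passed.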