Re: [PATCH 07/17] KVM: x86/tdp_mmu: Centralize updates to present external PTEs
From: Edgecombe, Rick P
Date: Wed Apr 01 2026 - 19:46:20 EST
On Mon, 2026-03-30 at 14:14 +0800, Yan Zhao wrote:
> > u64 new_spte)
> > {
> > + struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(iter-
> > >sptep));
> > int ret;
> >
> > lockdep_assert_held_read(&kvm->mmu_lock);
> >
> > - ret = __tdp_mmu_set_spte_atomic(kvm, iter, new_spte);
> > + /* KVM should never freeze SPTEs using higher level APIs. */
> "higher level API" is ambiguous. e.g. kvm_tdp_mmu_write_spte_atomic() allows
> new_spte to be FROZEN_SPTE.
Yea you are right. It felt too fuzzy but I couldn't think of a better word.
>
> What about just "callers of tdp_mmu_set_spte_atomic() should not freeze SPTEs
> directly"?
Sure.
>
> > + KVM_MMU_WARN_ON(is_frozen_spte(new_spte));
> > +
> > + /*
> > + * Temporarily freeze the SPTE until the external PTE operation has
> > + * completed (unless the new SPTE itself will be frozen), e.g. so
> > that
> > + * concurrent faults don't attempt to install a child PTE in the
> > + * external page table before the parent PTE has been written, or
> > try
> > + * to re-install a page table before the old one was removed.
> > + */
> > + if (is_mirror_sptep(iter->sptep))
> > + ret = __tdp_mmu_set_spte_atomic(kvm, iter, FROZEN_SPTE);
> > + else
> > + ret = __tdp_mmu_set_spte_atomic(kvm, iter, new_spte);
> > if (ret)
> > return ret;
> >
> > - handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
> > - new_spte, iter->level, true);
> > + ret = __handle_changed_spte(kvm, sp, iter->gfn, iter->old_spte,
> > + new_spte, iter->level, true);
>
> What about adding a comment for the tricky part for the mirror page table:
> while new_spte is set to FROZEN_SPTE in the above __tdp_mmu_set_spte_atomic()
You meant it sets iter->sptep I think.
> for freezing the mirror page table, the original new_spte from the caller of
> tdp_mmu_set_spte_atomic() is passed to __handle_changed_spte() in order to
> properly update statistics and propagate to the external page table.
new_spte was already passed in. What changed? You mean that
__tdp_mmu_set_spte_atomic() sets iter->sptep and doesn't update new_spte? If so
I'm not sure if it threshold TDP MMU.
>
> > - return 0;
> > + /*
> > + * Unfreeze the mirror SPTE. If updating the external SPTE failed,
> > + * restore the old SPTE so that the SPTE isn't frozen in
> > perpetuity,
> > + * otherwise set the mirror SPTE to the new desired value.
> > + */
> > + if (is_mirror_sptep(iter->sptep)) {
> > + if (ret)
> > + __kvm_tdp_mmu_write_spte(iter->sptep, iter-
> > >old_spte);
> > + else
> > + __kvm_tdp_mmu_write_spte(iter->sptep, new_spte);
> > + } else {
> > + /*
> > + * Bug the VM if handling the change failed, as failure is
> > only
> > + * allowed if KVM couldn't update the external SPTE.
> > + */
> > + KVM_BUG_ON(ret, kvm);
> > + }
> > + return ret;
> > }
> >
> > /*
> > @@ -738,6 +759,8 @@ static inline int __must_check
> > tdp_mmu_set_spte_atomic(struct kvm *kvm,
> > static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
> > u64 old_spte, u64 new_spte, gfn_t gfn, int
> > level)
> > {
> > + struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(sptep));
> > +
> > lockdep_assert_held_write(&kvm->mmu_lock);
> >
> > /*
> > @@ -751,7 +774,7 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id,
> > tdp_ptep_t sptep,
> >
> > old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte,
> > level);
> >
> > - handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level,
> > false);
> > + handle_changed_spte(kvm, sp, gfn, old_spte, new_spte, level,
> > false);
> >
> > /*
> > * Users that do non-atomic setting of PTEs don't operate on mirror
> > @@ -1373,6 +1396,9 @@ static void kvm_tdp_mmu_age_spte(struct kvm *kvm,
> > struct tdp_iter *iter)
> > {
> > u64 new_spte;
> >
> > + if (WARN_ON_ONCE(is_mirror_sptep(iter->sptep)))
> > + return;
> > +
> Add a comment for why mirror page table is not expected here?
Ehh, maybe. Thinking about what to put... The warning is kind of cheating a
little bit on the idea of the patch: to forward all changes through limited ops
in a central place, such that we don't have TDX specifics encoded in core MMU.
Trying to forward this through properly would result in more burden to the TDP
MMU, so that's not the right answer either.
"Mirror TDP doesn't support PTE aging" is a pretty obvious comment. I'm fine
just leaving it without comment, but I can add something like that. Or do you
have another suggestion?
>
> And do we need a similar WARN_ON_ONCE() in kvm_tdp_mmu_clear_dirty_pt_masked()
> or clear_dirty_pt_masked()?
Nothing changes for those in this patch though? For the kvm_tdp_mmu_age_spte()
case, warning coverage is removed in this patch.
>
> > if (spte_ad_enabled(iter->old_spte)) {
> > iter->old_spte = tdp_mmu_clear_spte_bits_atomic(iter-
> > >sptep,
> > shadow_acce
> > ssed_mask);
> > --
> > 2.53.0