Re: [RFC PATCH v5 02/45] KVM: x86/mmu: Update iter->old_spte if cmpxchg64 on mirror SPTE "fails"

From: Sean Christopherson

Date: Thu Jan 29 2026 - 17:24:17 EST


On Thu, Jan 29, 2026, Rick P Edgecombe wrote:
> On Wed, 2026-01-28 at 17:14 -0800, Sean Christopherson wrote:
> > Pass a pointer to iter->old_spte, not simply its value, when setting an
> > external SPTE in __tdp_mmu_set_spte_atomic(), so that the iterator's value
> > will be updated if the cmpxchg64 to freeze the mirror SPTE fails.
> >
>
> Might be being dense here, but is the bug that if cmpxchg64 *succeeds* and
> set_external_spte() fails? Then old_spte is not updated and the local retry will
> expect the wrong old_spte.

No, the bug is if the cmpxchg64 fails. On failure, the current mismatching value
is stored in the "old" param. KVM relies on the iter->old_spte holding the
current value when restarting an operation without re-reading the SPTE from memory.

E.g. in __tdp_mmu_zap_root(), if tdp_mmu_set_spte_atomic() fails, iter->old_spte
*must* hold the current in-memory value, otherwise the loop will hang because it
will re-attempt cmpxchg64 using the stale iter->old_spte.

static void __tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
			       bool shared, int zap_level)
{
	struct tdp_iter iter;

	for_each_tdp_pte_min_level_all(iter, root, zap_level) {
retry:
		if (tdp_mmu_iter_cond_resched(kvm, &iter, false, shared))
			continue;

		if (!is_shadow_present_pte(iter.old_spte))
			continue;

		if (iter.level > zap_level)
			continue;

		if (!shared)
			tdp_mmu_iter_set_spte(kvm, &iter, SHADOW_NONPRESENT_VALUE);
		else if (tdp_mmu_set_spte_atomic(kvm, &iter, SHADOW_NONPRESENT_VALUE))
			goto retry;
	}
}
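
To make the failure semantics concrete outside of KVM, here is a minimal
userspace sketch using C11 atomics, whose compare-exchange also writes the
current value back into the "expected" argument on failure. Everything in it
(fake_spte, set_spte_by_value(), set_spte_by_pointer(), zap_with_retry()) is a
hypothetical stand-in for illustration, not KVM code, and the retry count is
capped only so that the by-value variant gives up instead of spinning forever.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a single SPTE; not KVM's actual type. */
static _Atomic uint64_t fake_spte;

/*
 * Buggy shape: "old" is passed by value, so the refreshed value that the
 * compare-exchange writes on failure lands in a local copy and is lost.
 */
static bool set_spte_by_value(uint64_t old, uint64_t new_val)
{
	return atomic_compare_exchange_strong(&fake_spte, &old, new_val);
}

/*
 * Fixed shape: "old" is passed by pointer, so on failure the caller's copy
 * is updated to the value currently in memory.
 */
static bool set_spte_by_pointer(uint64_t *old, uint64_t new_val)
{
	return atomic_compare_exchange_strong(&fake_spte, old, new_val);
}

/*
 * Retry loop that never re-reads fake_spte on its own, mimicking a "local"
 * retry.  Returns the number of attempts needed to zap the entry, or -1 if
 * max_attempts is exhausted (an unbounded loop would hang instead).
 */
static int zap_with_retry(uint64_t stale_old, bool by_pointer, int max_attempts)
{
	uint64_t old = stale_old;

	for (int i = 1; i <= max_attempts; i++) {
		bool ok = by_pointer ? set_spte_by_pointer(&old, 0)
				     : set_spte_by_value(old, 0);
		if (ok)
			return i;
	}
	return -1;
}
```

Starting from a stale "old", the by-pointer variant fails once, picks up the
current value, and succeeds on the second attempt. The by-value variant keeps
comparing against the same stale value and never makes progress.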

> >   The bug
> > is currently benign as TDX is mutually exclusive with all paths that do
> > "local" retry, e.g. clear_dirty_gfn_range() and wrprot_gfn_range().
>
>