and after spending quite some time I wonder if all this should just be
if (refcount_dec_not_one(&root->tdp_mmu_root_count))
return;
if (!xchg(&root->role.invalid, true) {
The refcount being '1' means there's another task currently using root, marking
the root invalid will mean checks on the root's validity are non-deterministic
for the other task.
tdp_mmu_zap_root(kvm, root, shared);
/*
* Do not assume the refcount is still 1: because
* tdp_mmu_zap_root can yield, a different task
* might have grabbed a reference to this root.
*
if (refcount_dec_not_one(&root->tdp_mmu_root_count))
This is wrong, _this_ task can't drop a reference taken by the other task.
return;
}
/*
* The root is invalid, and its reference count has reached
* zero. It must have been zapped either in the "if" above or
* by someone else, and we're definitely the last thread to see
* it apart from RCU-protected page table walks.
*/
refcount_set(&root->tdp_mmu_root_count, 0);
Not sure what you intended here, KVM should never force a refcount to '0'.
xchg() is a very good idea. The smp_mb_*() stuff was carried over from the previous
version where this sequence set another flag in addition to role.invalid.
Is this less funky (untested)?
/*
* Invalidate the root to prevent it from being reused by a vCPU while
* the root is being zapped, i.e. to allow yielding while zapping the
* root (see below).
*
* Free the root if it's already invalid. Invalid roots must be zapped
* before their last reference is put, i.e. there's no work to be done,
* and all roots must be invalidated before they're freed (this code).
* Re-zapping invalid roots would put KVM into an infinite loop.
*
* Note, xchg() provides an implicit barrier to ensure role.invalid is
* visible if a concurrent reader acquires a reference after the root's
* refcount is reset.
*/
if (xchg(root->role.invalid, true))
spin_lock(&kvm->arch.tdp_mmu_pages_lock);
list_del_rcu(&root->link);
spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
return;
}