Re: [PATCH v2 1/7] KVM: x86/mmu: Track if shadow MMU active

From: Paolo Bonzini
Date: Tue May 04 2021 - 16:18:55 EST

On 04/05/21 19:26, Ben Gardon wrote:
On Mon, May 3, 2021 at 6:42 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:

On 29/04/21 23:18, Ben Gardon wrote:
+void activate_shadow_mmu(struct kvm *kvm)
+{
+	kvm->arch.shadow_mmu_active = true;
+}

I think there's no lock protecting both the write and the read side.
Therefore this should be an smp_store_release, and all checks in
patch 2 should be an smp_load_acquire.

That makes sense.
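
For readers without a kernel tree at hand, the release/acquire pairing suggested above can be sketched in userspace with C11 atomics. This is only an analogue: the kernel code would use smp_store_release() and smp_load_acquire(), and struct demo_arch, rmap_ready and the demo_* helpers are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct kvm_arch. */
struct demo_arch {
	int rmap_ready;			/* state that must be visible before the flag */
	atomic_bool shadow_mmu_active;
};

/* Writer: userspace analogue of smp_store_release(&flag, true). */
static inline void demo_activate_shadow_mmu(struct demo_arch *arch)
{
	arch->rmap_ready = 1;		/* set up the state first */
	atomic_store_explicit(&arch->shadow_mmu_active, true,
			      memory_order_release);
}

/* Reader: userspace analogue of smp_load_acquire(&flag). */
static inline bool demo_shadow_mmu_active(struct demo_arch *arch)
{
	return atomic_load_explicit(&arch->shadow_mmu_active,
				    memory_order_acquire);
}

/* Returns 1 if the flag was seen set (and the state with it), else 0. */
static inline int demo_reader(struct demo_arch *arch)
{
	if (!demo_shadow_mmu_active(arch))
		return 0;
	/* The acquire load orders this read after the flag check. */
	assert(arch->rmap_ready == 1);
	return 1;
}
```

The point of the pairing is that a reader which observes the flag as true is also guaranteed to observe everything the writer stored before setting it.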

Also, the assignments to slot->arch.rmap in patch 4 (alloc_memslot_rmap)
should be an rcu_assign_pointer, while __gfn_to_rmap must be changed like so:

+	struct kvm_rmap_head *head;
-	return &slot->arch.rmap[level - PG_LEVEL_4K][idx];
+	head = srcu_dereference(slot->arch.rmap[level - PG_LEVEL_4K], &kvm->srcu,
+				lockdep_is_held(&kvm->slots_arch_lock));
+	return &head[idx];

I'm not sure I fully understand why this becomes necessary after patch
4. Isn't it already needed since the memslots are protected by RCU? Or
is there already a higher level rcu dereference?

__kvm_memslots already does an srcu dereference, so is there a path
where we aren't getting the slots from that function where this is

There are two points of view:

1) the easier one is just CONFIG_PROVE_RCU debugging: the rmaps need to be accessed under RCU because the memslots can disappear as soon as kvm->srcu is unlocked.

2) the harder one (though at this point I'm better at figuring out these ordering bugs than "traditional" mutex races) is what the happens-before relation[1] looks like. Consider what happens if the rmaps are allocated by *another thread* after the slots have been fetched.

thread 1                thread 2                        thread 3
allocate memslots
                        slots = srcu_dereference
                                                        allocate rmap
                        head = slot->arch.rmap[]

Here, thread 3 is allocating the rmaps in the SRCU-protected kvm_memslots. Those rmaps didn't exist at the time thread 1 did the rcu_assign_pointer (which synchronizes with thread 2's srcu_dereference that retrieves slots), hence they were not covered by the release semantics of that rcu_assign_pointer and the "consume" semantics of the corresponding srcu_dereference. Therefore, thread 2 needs another srcu_dereference when retrieving them.
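
The second publication step described above can be sketched as a userspace analogue with C11 atomics. The assumptions are loud ones: struct demo_slot, struct demo_rmap_head and the demo_* helpers are hypothetical, memory_order_acquire stands in for the "consume" semantics of srcu_dereference, and the NULL check exists only so the sketch can be exercised before the rmap is allocated. The kernel code would use rcu_assign_pointer() and srcu_dereference() instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kvm_rmap_head. */
struct demo_rmap_head {
	unsigned long val;
};

/* Hypothetical stand-in for a memslot that readers can already see. */
struct demo_slot {
	/* Published later, possibly by a different thread. */
	_Atomic(struct demo_rmap_head *) rmap;
};

/*
 * "Thread 3": allocate and zero the array, then publish it with release
 * semantics, the role rcu_assign_pointer() plays in the kernel.
 */
static inline int demo_alloc_rmap(struct demo_slot *slot, size_t npages)
{
	struct demo_rmap_head *heads = calloc(npages, sizeof(*heads));

	if (!heads)
		return -1;
	atomic_store_explicit(&slot->rmap, heads, memory_order_release);
	return 0;
}

/*
 * "Thread 2": the slot was fetched earlier, so the rmap pointer needs its
 * own ordered load, the role srcu_dereference() plays in __gfn_to_rmap.
 */
static inline struct demo_rmap_head *demo_gfn_to_rmap(struct demo_slot *slot,
						      size_t idx)
{
	struct demo_rmap_head *head =
		atomic_load_explicit(&slot->rmap, memory_order_acquire);

	return head ? &head[idx] : NULL;
}
```

The load of the slot pointer and the load of the rmap pointer each pair with a different store, which is why the ordering obtained when fetching the slots does not extend to the rmaps.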



I wouldn't say that the rmaps are protected by RCU in any way that is
separate from the memslots.