Re: [PATCH v3 7/8] KVM: x86/mmu: Protect rmaps independently with SRCU

From: Sean Christopherson
Date: Mon May 10 2021 - 14:28:47 EST


On Mon, May 10, 2021, Paolo Bonzini wrote:
> On 10/05/21 19:45, Sean Christopherson wrote:
> > >
> > > ---------
> > > Currently, rmaps are always allocated and published together with a new
> > > memslot, so the srcu_dereference for the memslots array already ensures that
> > > the memory pointed to by slots->arch.rmap is zero at the time
> > > slots->arch.rmap. However, they still need to be accessed in an SRCU
> > > read-side critical section, as the whole memslot can be deleted outside
> > > SRCU.
> > > --------
> > I disagree, sprinkling random and unnecessary __rcu/SRCU annotations does more
> > harm than good. Adding the unnecessary tag could be quite misleading as it
> > would imply the rmap pointers can _change_ independently of the memslots.
> >
> > Similarly, adding rcu_assign_pointer() in alloc_memslot_rmap() implies that it's
> > safe to access the rmap after its pointer is assigned, and that's simply not
> > true since an rmap array can be freed if rmap allocation for a different memslot
> > fails. Accessing the rmap is safe if and only if all rmaps are allocated, i.e.
> > if arch.memslots_have_rmaps is true, as you pointed out.
>
> The point about freeing is a very good one.
>
> > Furthermore, to actually gain any protection from SRCU, there would have to be
> > a synchronize_srcu() call after assigning the pointers, and that _does_ have an
> > associated.
>
> ... but this is incorrect (I was almost going to point out the below in my
> reply to Ben, then decided I was pointing out the obvious; lesson learned).
>
> synchronize_srcu() is only needed after *deleting* something, which in this

No, synchronization is required any time the writer needs to ensure readers have
recognized the change. E.g. making a memslot RO, moving a memslot's gfn base,
adding an MSR to the filter list. I suppose you could frame any modification as
"deleting" something, but IMO that's cheating :-)

> case is done as part of deleting the memslots---it's perfectly fine to batch
> multiple synchronize_*() calls given how expensive some of them are.

Yes, but the shortlog says "Protect rmaps _independently_ with SRCU", emphasis
mine. If the rmaps are truly protected independently, then they need to have
their own synchronization. Setting all rmaps could be batched under a single
synchronize_srcu(), but IMO batching the rmaps with the memslot itself would be
in direct contradiction with the shortlog.

> (BTW an associated what?)

Doh. "associated memslot."

> So they still count as RCU-protected in my opinion, just because reading
> them outside SRCU is a big no and ought to warn (it's unlikely that it
> happens with rmaps, but then we just had 2-3 bugs like this being reported
> in a short time for memslots so never say never).

Yes, but that interpretation holds true for literally everything that is hidden
behind an SRCU-protected pointer. E.g. this would also be wrong, it's just much
more obviously broken:

bool kvm_is_gfn_writable(struct kvm *kvm, gfn_t gfn)
{
	struct kvm_memory_slot *slot;
	int idx;

	idx = srcu_read_lock(&kvm->srcu);
	slot = gfn_to_memslot(kvm, gfn);
	srcu_read_unlock(&kvm->srcu, idx);

	/* Broken: slot is dereferenced after the SRCU read lock is dropped. */
	return slot && !(slot->flags & KVM_MEMSLOT_INVALID) &&
	       !(slot->flags & KVM_MEM_READONLY);
}
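
For contrast, a minimal sketch of the non-broken shape, where the slot is only
dereferenced inside the read-side critical section. To make it compile outside
the kernel, the types, flag values, srcu_read_lock()/srcu_read_unlock() and
gfn_to_memslot() below are userspace stand-ins (assumptions for illustration,
not the real KVM/SRCU implementations); only the ordering inside
kvm_is_gfn_writable() is the point.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins (assumptions) so the sketch builds outside the kernel. */
typedef unsigned long long gfn_t;
#define KVM_MEM_READONLY	(1u << 1)
#define KVM_MEMSLOT_INVALID	(1u << 16)

struct kvm_memory_slot {
	gfn_t base_gfn;
	unsigned long npages;
	unsigned int flags;
};

struct srcu_struct {
	int readers;	/* mock: counts active readers, nothing more */
};

struct kvm {
	struct srcu_struct srcu;
	struct kvm_memory_slot *slot;	/* mock: a single memslot */
};

static int srcu_read_lock(struct srcu_struct *sp)
{
	return sp->readers++;
}

static void srcu_read_unlock(struct srcu_struct *sp, int idx)
{
	(void)idx;
	sp->readers--;
}

static struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
{
	struct kvm_memory_slot *s = kvm->slot;

	if (s && gfn >= s->base_gfn && gfn < s->base_gfn + s->npages)
		return s;
	return NULL;
}

bool kvm_is_gfn_writable(struct kvm *kvm, gfn_t gfn)
{
	struct kvm_memory_slot *slot;
	bool writable;
	int idx;

	idx = srcu_read_lock(&kvm->srcu);
	slot = gfn_to_memslot(kvm, gfn);
	/* Read slot->flags before leaving the read-side critical section. */
	writable = slot && !(slot->flags & KVM_MEMSLOT_INVALID) &&
		   !(slot->flags & KVM_MEM_READONLY);
	srcu_read_unlock(&kvm->srcu, idx);

	return writable;
}
```

Note that even in this form the returned value is only a snapshot; the memslot
can change as soon as the read lock is dropped.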


> However, rcu_assign_pointer is not needed because the visibility of the rmaps
> is further protected by the have-rmaps flag (to be accessed with
> load-acquire/store-release) and not just by the pointer being there and
> non-NULL.

Yes, and I'm arguing that annotating the rmaps as __rcu is wrong because they
themselves are not protected by SRCU. The memslot that contains the rmaps is
protected by SRCU, and because of that asserting SRCU is held for read will hold
true. But, if the memslot code were changed to use a different protection scheme,
e.g. a rwlock for argument's sake, then the SRCU assertion would fail even though
the rmap logic itself didn't change.
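
To illustrate the "safe if and only if all rmaps are allocated" rule discussed
above, here is a rough userspace sketch using C11 atomics in place of the
kernel's store-release/load-acquire helpers. Everything except the
memslots_have_rmaps name is hypothetical (struct layout, NR_SLOTS,
alloc_all_rmaps(), get_rmap()), and it ignores locking between writers; it only
shows that the flag, not the individual rmap pointers, is what gets published.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_SLOTS 4	/* hypothetical fixed number of memslots */

struct slot {
	unsigned long *rmap;
	size_t npages;		/* assumed non-zero in this sketch */
};

struct vm {
	struct slot slots[NR_SLOTS];
	atomic_bool memslots_have_rmaps;
};

/* Writer: allocate every rmap, publish the flag only if all succeeded. */
int alloc_all_rmaps(struct vm *vm)
{
	for (int i = 0; i < NR_SLOTS; i++) {
		vm->slots[i].rmap = calloc(vm->slots[i].npages,
					   sizeof(unsigned long));
		if (!vm->slots[i].rmap) {
			/*
			 * Partial failure: free the arrays already assigned.
			 * This is why a non-NULL pointer alone says nothing;
			 * readers must not look until the flag is set.
			 */
			while (i--) {
				free(vm->slots[i].rmap);
				vm->slots[i].rmap = NULL;
			}
			return -1;
		}
	}

	/* Store-release: the zeroed rmaps are visible before the flag. */
	atomic_store_explicit(&vm->memslots_have_rmaps, true,
			      memory_order_release);
	return 0;
}

/* Reader: touch an rmap only after observing the flag with load-acquire. */
unsigned long *get_rmap(struct vm *vm, int slot, size_t idx)
{
	if (!atomic_load_explicit(&vm->memslots_have_rmaps,
				  memory_order_acquire))
		return NULL;

	return &vm->slots[slot].rmap[idx];
}
```

Nothing here depends on how the containing memslot is protected, which is the
distinction being argued: the rmap logic keys off the flag, and the lifetime of
the memslot is a separate matter.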