Re: [PATCH 16/18] KVM: Don't take mmu_lock for range invalidation unless necessary

From: Sean Christopherson
Date: Wed Mar 31 2021 - 15:50:21 EST


On Wed, Mar 31, 2021, Paolo Bonzini wrote:
> On 31/03/21 18:41, Sean Christopherson wrote:
> > > That said, the easiest way to avoid this would be to always update
> > > mmu_notifier_count.
> > Updating mmu_notifier_count requires taking mmu_lock, which would defeat the
> > purpose of these shenanigans.
>
> Okay; I wasn't sure if the problem was contention with page faults in
> general, or just the long critical sections from the MMU notifier callbacks.
> Still updating mmu_notifier_count unconditionally is a good way to break up
> the patch in two and keep one commit just for the rwsem nastiness.

Rereading things, a small chunk of the rwsem nastiness can go away. I don't see
any reason to use rw_semaphore instead of rwlock_t. install_new_memslots() only
holds the lock for a handful of instructions. Readers could get queued up
behind a writer, but since install_new_memslots() is serialized by slots_lock
(the existing mutex), there is no chance of multiple writers, i.e. the worst
case wait duration is bounded at the length of an in-flight notification. And
that's _already_ the worst case since notifications are currently serialized by
mmu_lock. In practice, the existing worst case is probably far worse since
there can be far more writers trying to acquire mmu_lock.

In other words, there's no strong argument for sleeping instead of busy waiting
in the notifiers.

By switching to rwlock_t, taking mmu_notifier_slots_lock doesn't have to depend
on mmu_notifier_range_blockable(), and the must_lock path also goes away.

> > > > +#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
> > > > + down_write(&kvm->mmu_notifier_slots_lock);
> > > > +#endif
> > > > rcu_assign_pointer(kvm->memslots[as_id], slots);
> > > > +#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
> > > > + up_write(&kvm->mmu_notifier_slots_lock);
> > > > +#endif
> > > Please do this unconditionally, the cost is minimal if the rwsem is not
> > > contended (as is the case if the architecture doesn't use MMU notifiers at
> > > all).
> > It's not the cost, it's that mmu_notifier_slots_lock doesn't exist. That's an
> > easily solved problem, but then the lock wouldn't be initialized since
> > kvm_init_mmu_notifier() is a nop. That's again easy to solve, but IMO would
> > look rather weird. I guess the counter argument is that __kvm_memslots()
> > wouldn't need #ifdeffery.
>
> Yep. Less #ifdefs usually wins. :)
>
> > These are the to ideas I've come up with:
> >
> > Option 1:
> > static int kvm_init_mmu_notifier(struct kvm *kvm)
> > {
> > init_rwsem(&kvm->mmu_notifier_slots_lock);
> >
> > #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
> > kvm->mmu_notifier.ops = &kvm_mmu_notifier_ops;
> > return mmu_notifier_register(&kvm->mmu_notifier, current->mm);
> > #else
> > return 0;
> > #endif
> > }
>
> Option 2 is also okay I guess, but the simplest is option 1 + just init it
> in kvm_create_vm.

Arr. I'll play around with it to try and purge the #ifdefs.