Re: [PATCH v3 10/22] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock

From: Isaku Yamahata
Date: Thu Sep 08 2022 - 14:25:08 EST


On Tue, Sep 06, 2022 at 02:44:34PM -0700,
Isaku Yamahata <isaku.yamahata@xxxxxxxxx> wrote:

> On Tue, Sep 06, 2022 at 07:32:22AM +0100,
> Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> > On Tue, 06 Sep 2022 03:46:43 +0100,
> > Yuan Yao <yuan.yao@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On Thu, Sep 01, 2022 at 07:17:45PM -0700, isaku.yamahata@xxxxxxxxx wrote:
> > > > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > > >
> > > > > Because kvm_count_lock unnecessarily complicates the KVM locking convention,
> > > > > drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock for
> > > > > simplicity.
> > > >
> > > > Opportunistically add some comments on locking.
> > > >
> > > > Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > > > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > > > ---
> > > > Documentation/virt/kvm/locking.rst | 14 +++++-------
> > > > virt/kvm/kvm_main.c | 34 ++++++++++++++++++++----------
> > > > 2 files changed, 28 insertions(+), 20 deletions(-)
> > > >
> > > > diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
> > > > index 845a561629f1..8957e32aa724 100644
> > > > --- a/Documentation/virt/kvm/locking.rst
> > > > +++ b/Documentation/virt/kvm/locking.rst
> > > > @@ -216,15 +216,11 @@ time it will be set using the Dirty tracking mechanism described above.
> > > > :Type: mutex
> > > > :Arch: any
> > > > :Protects: - vm_list
> > > > -
> > > > -``kvm_count_lock``
> > > > -^^^^^^^^^^^^^^^^^^
> > > > -
> > > > -:Type: raw_spinlock_t
> > > > -:Arch: any
> > > > -:Protects: - hardware virtualization enable/disable
> > > > -:Comment: 'raw' because hardware enabling/disabling must be atomic /wrt
> > > > - migration.
> > > > + - kvm_usage_count
> > > > + - hardware virtualization enable/disable
> > > > +:Comment: Use cpus_read_lock() for hardware virtualization enable/disable
> > > > + because hardware enabling/disabling must be atomic /wrt
> > > > + migration. The lock order is cpus lock => kvm_lock.
> > > >
> > > > ``kvm->mn_invalidate_lock``
> > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > > index fc55447c4dba..082d5dbc8d7f 100644
> > > > --- a/virt/kvm/kvm_main.c
> > > > +++ b/virt/kvm/kvm_main.c
> > > > @@ -100,7 +100,6 @@ EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
> > > > */
> > > >
> > > > DEFINE_MUTEX(kvm_lock);
> > > > -static DEFINE_RAW_SPINLOCK(kvm_count_lock);
> > > > LIST_HEAD(vm_list);
> > > >
> > > > static cpumask_var_t cpus_hardware_enabled;
> > > > @@ -4996,6 +4995,8 @@ static void hardware_enable_nolock(void *caller_name)
> > > > int cpu = raw_smp_processor_id();
> > > > int r;
> > > >
> > > > + WARN_ON_ONCE(preemptible());
> > >
> > > > This looks incorrect; it may trigger every time a CPU is brought online,
> > > > because patch 7 moved CPUHP_AP_KVM_STARTING *AFTER*
> > > > CPUHP_AP_ONLINE_IDLE as CPUHP_AP_KVM_ONLINE, so cpuhp_thread_fun()
> > > > runs the new CPUHP_AP_KVM_ONLINE in *non-atomic* context:
> > >
> > > > cpuhp_thread_fun(unsigned int cpu) {
> > > > 	...
> > > > 	if (cpuhp_is_atomic_state(state)) {
> > > > 		local_irq_disable();
> > > > 		st->result = cpuhp_invoke_callback(cpu, state, bringup, st->node, &st->last);
> > > > 		local_irq_enable();
> > > >
> > > > 		WARN_ON_ONCE(st->result);
> > > > 	} else {
> > > > 		st->result = cpuhp_invoke_callback(cpu, state, bringup, st->node, &st->last);
> > > > 	}
> > > > 	...
> > > > }
> > > >
> > > > static bool cpuhp_is_atomic_state(enum cpuhp_state state)
> > > > {
> > > > 	return CPUHP_AP_IDLE_DEAD <= state && state < CPUHP_AP_ONLINE;
> > > > }
> > >
> > > > hardware_enable_nolock() is now called in two cases:
> > > > 1. From atomic context, by on_each_cpu().
> > > > 2. From non-atomic context, by the CPU hotplug thread.
> > >
> > > > So how about "WARN_ON_ONCE(preemptible() && cpu_active(cpu))"?
> >
> > I suspect similar changes must be applied to the arm64 side (though
> > I'm still looking for a good definition of cpu_active()).
>
> It seems plausible. I tested CPU online/offline on x86. Let me update the arm64
> code too.

On second thought, I decided to add preempt_disable/enable() instead of fixing
up possible arch callbacks and let each arch handle it.
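
The idea can be sketched roughly as below. This is a userspace model under
stated assumptions, not the actual patch: every mock_* name is a hypothetical
stand-in (a plain counter replaces the kernel's preemption count, and a
warning counter replaces WARN_ON_ONCE()). It only illustrates that wrapping
the call in a preempt_disable/enable() pair keeps the preemptible() check
quiet when the hotplug thread invokes the callback from non-atomic context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for kernel state; not kernel APIs. */
static int mock_preempt_count;	/* 0 means the context is preemptible */
static int mock_warnings;	/* how often the preemptible() check fired */
static bool mock_hw_enabled;	/* models "virtualization enabled on this CPU" */

static void mock_preempt_disable(void)
{
	mock_preempt_count++;
}

static void mock_preempt_enable(void)
{
	/* An unbalanced enable would be a bug in the caller. */
	assert(mock_preempt_count > 0);
	mock_preempt_count--;
}

static bool mock_preemptible(void)
{
	return mock_preempt_count == 0;
}

/* Models hardware_enable_nolock() carrying the preemptible() check. */
static void mock_hardware_enable_nolock(void)
{
	if (mock_preemptible())
		mock_warnings++;	/* the kernel would warn here */
	mock_hw_enabled = true;
}

/*
 * Models the CPU online callback, which the hotplug thread runs in
 * preemptible context. Disabling preemption around the call satisfies
 * the check without changing any arch callback.
 */
static int mock_kvm_online_cpu(void)
{
	mock_preempt_disable();
	mock_hardware_enable_nolock();
	mock_preempt_enable();
	return 0;
}
```

In this model, calling mock_hardware_enable_nolock() directly from a
preemptible context bumps mock_warnings, while going through
mock_kvm_online_cpu() does not and leaves the preemption count balanced.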
--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>