Re: [PATCH] kvm: x86: keep srcu writer side operation mutually exclusive

From: Sean Christopherson
Date: Mon Oct 10 2022 - 13:38:58 EST


On Sun, Oct 09, 2022, Hao Peng wrote:
> On Sat, Oct 8, 2022 at 1:12 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Sat, Oct 08, 2022, Hao Peng wrote:
> > > From: Peng Hao <flyingpeng@xxxxxxxxxxx>
> > >
> > > Synchronization operations on the writer side of SRCU should be
> > > invoked within the mutex.
> >
> > Why? Synchronizing SRCU is necessary only to ensure that all previous readers go
> > away before the old filter is freed. There's no need to serialize synchronization
> > between writers. The mutex ensures each writer operates on the "new" filter that's
> > set by the previous writer, i.e. there's no danger of a double-free. And the next
> > writer will wait for readers to _its_ "new" filter.
> >
> Array srcu_lock_count/srcu_unlock_count[] in srcu_data, which is used
> alternately to determine which readers need to wait to get out of the
> critical area. If two synchronize_srcu are initiated concurrently, there
> may be a problem with the judgment of gp. But if it is confirmed that
> there will be no writer concurrency, it is not necessary to ensure that
> synchronize_srcu is executed within the scope of the mutex lock.

I don't see anything in the RCU documentation or code that suggests that callers
need to serialize synchronization calls. E.g. the "tree" SRCU implementation uses
a dedicated mutex to serialize grace period work:

	struct mutex srcu_gp_mutex;		/* Serialize GP work. */

	static void srcu_advance_state(struct srcu_struct *ssp)
	{
		int idx;

		mutex_lock(&ssp->srcu_gp_mutex);

		<magic>
	}


and its state machine explicitly accounts for "Someone else" starting a grace
period:

	if (idx != SRCU_STATE_IDLE) {
		mutex_unlock(&ssp->srcu_gp_mutex);
		return; /* Someone else started the grace period. */
	}

and srcu_gp_end() also guards against creating more than 2 grace periods.

	/* Prevent more than one additional grace period. */
	mutex_lock(&ssp->srcu_cb_mutex);

And if this is a subtle requirement, there is a lot of broken kernel code, e.g.
mmu_notifier, other KVM code, srcu_notifier_chain_unregister(), etc...
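
To make the writer-side argument concrete, here is a rough userspace sketch of
the pattern (pthreads, not kernel code). Every toy_* name is made up for
illustration, and toy_synchronize() is only a counting placeholder for
synchronize_srcu(); it does not wait for readers, because the sketch has none.
All it tries to show is that swapping the pointer under the mutex is enough for
each writer to end up owning exactly one "old" filter, so waiting and freeing
outside the mutex can't double-free.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NUM_WRITERS		4
#define UPDATES_PER_WRITER	1000
#define TOTAL_FILTERS		(1 + NUM_WRITERS * UPDATES_PER_WRITER)

/* Hypothetical stand-in for the filter object; not a real kernel type. */
struct toy_filter {
	int id;
};

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_filter *toy_current;	/* Protected by toy_lock. */
static int toy_next_id;			/* Protected by toy_lock. */
static atomic_int toy_free_count[TOTAL_FILTERS];
static atomic_int toy_sync_calls;

/* Placeholder for synchronize_srcu(): only counts calls, waits for nobody. */
static void toy_synchronize(void)
{
	atomic_fetch_add(&toy_sync_calls, 1);
}

/* Allocate a filter with a unique id.  Caller must hold toy_lock. */
static struct toy_filter *toy_alloc_locked(void)
{
	struct toy_filter *f = malloc(sizeof(*f));

	if (!f)
		abort();
	f->id = toy_next_id++;
	return f;
}

static void toy_set_filter(void)
{
	struct toy_filter *new, *old;

	/* The mutex only covers the pointer swap. */
	pthread_mutex_lock(&toy_lock);
	new = toy_alloc_locked();
	old = toy_current;
	toy_current = new;
	pthread_mutex_unlock(&toy_lock);

	/* Outside the mutex: "wait for readers", then free what we replaced. */
	toy_synchronize();
	atomic_fetch_add(&toy_free_count[old->id], 1);
	free(old);
}

static void *toy_writer(void *arg)
{
	(void)arg;
	for (int i = 0; i < UPDATES_PER_WRITER; i++)
		toy_set_filter();
	return NULL;
}

/* Run once.  Returns the number of filters with an unexpected free count. */
static int toy_run(void)
{
	pthread_t threads[NUM_WRITERS];
	int bad = 0;

	pthread_mutex_lock(&toy_lock);
	toy_current = toy_alloc_locked();
	pthread_mutex_unlock(&toy_lock);

	for (int i = 0; i < NUM_WRITERS; i++) {
		int rc = pthread_create(&threads[i], NULL, toy_writer, NULL);

		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < NUM_WRITERS; i++)
		pthread_join(threads[i], NULL);

	/* The last installed filter is still live; all others are freed once. */
	for (int id = 0; id < TOTAL_FILTERS; id++) {
		int expect = (id == toy_current->id) ? 0 : 1;

		if (atomic_load(&toy_free_count[id]) != expect)
			bad++;
	}
	return bad;
}
```

A caller would invoke toy_run() once and expect 0 back. This says nothing about
SRCU's internal grace-period bookkeeping, which is what srcu_gp_mutex and
srcu_cb_mutex handle in the snippets above; it only illustrates the ownership
argument on the caller's side.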