Re: [RFC PATCH] v3 RCU implementation with fast grace periods

From: Mathieu Desnoyers
Date: Wed Apr 22 2009 - 11:50:25 EST


* Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> On Tue, Apr 21, 2009 at 05:11:58PM -0400, Mathieu Desnoyers wrote:
> > * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > > On Tue, Apr 21, 2009 at 11:10:35AM -0400, Mathieu Desnoyers wrote:
> > > > * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > >
> > > [ . . . ]
> > >
> > > > > +void synchronize_rcu_fgp(void)
> > > > > +{
> > > > > + mutex_lock(&rcu_fgp_mutex);
> > > > > +
> > > > > + /* CPUs must see earlier change before parity flip. */
> > > > > + smp_call_function(rcu_fgp_do_mb, NULL, 1);
> > > > > +
> > > >
> > > > Hrm, my original comment about missing smp_mb() here still applies, I
> > > > don't think we have come to an agreement yet.
> > >
> > > My argument is that smp_call_function() must necessarily contain a
> > > full memory barrier, otherwise it cannot function properly. ;-)
> > >
> >
> > Looking at :
> >
> > kernel/smp.c
> >
> > smp_call_function_many() indeed has a smp_mb(). It is called by
> > smp_call_function(). I wonder if it could eventually be turned into a
> > smp_wmb() instead ? If this is even a remote possibility, then the fact
> > that
> >
> > - The rcu_fgp code does not document that it expects smp_call_function()
> > to have a smp_mb().
> > - The fact that smp_call_function_many() comments do not state that this
> > function provides the guarantee to run a smp_mb().
> >
> > are both asking for an eventual bug to creep into the kernel.
>
> Many bugs -- I believe that a number of users of smp_call_function()
> assume that it maintains ordering between the calling code and all
> invocations of the function passed to smp_call_function().
>
> > So your assumption seems OK, but I think it needs to be explicitly
> > documented.
>
> That might well be a good thing.
>
> Thanx, Paul

And while we are at it: you should probably add a lockdep annotation to
this new lock.
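
To make the smp_mb() point quoted above concrete, here is a small
userspace model (a sketch only, not part of the patch). It assumes the
usual semantics: smp_call_function() runs the callback on every CPU
except the caller, whereas on_each_cpu() also runs it on the caller. All
sim_* names and NR_SIM_CPUS are hypothetical, and the model only counts
which "CPUs" ran the callback; it does not model real memory ordering.

```c
#include <assert.h>

#define NR_SIM_CPUS 4	/* hypothetical CPU count for this model */

/* How many times each simulated CPU has run the barrier callback. */
static int sim_mb_count[NR_SIM_CPUS];

/* Stand-in for rcu_fgp_do_mb(): record that this "CPU" ran a barrier. */
static void sim_do_mb(int cpu)
{
	sim_mb_count[cpu]++;
}

/* Models smp_call_function(): run func on every CPU except the caller. */
static void sim_smp_call_function(int self, void (*func)(int))
{
	int cpu;

	assert(self >= 0 && self < NR_SIM_CPUS);
	for (cpu = 0; cpu < NR_SIM_CPUS; cpu++)
		if (cpu != self)
			func(cpu);
}

/* Models on_each_cpu(): run func on every CPU, caller included. */
static void sim_on_each_cpu(int self, void (*func)(int))
{
	sim_smp_call_function(self, func);
	func(self);
}

/* Clear the per-CPU counters between experiments. */
static void sim_reset(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_SIM_CPUS; cpu++)
		sim_mb_count[cpu] = 0;
}
```

With CPU 0 as the updater, sim_smp_call_function(0, sim_do_mb) leaves
sim_mb_count[0] at zero: the updater gets no barrier from the callback
itself, so it depends entirely on whatever ordering smp_call_function()
provides internally. sim_on_each_cpu(0, sim_do_mb) bumps every counter,
the updater's included, which is why that dependency ought to be
documented if smp_call_function() is kept.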

I never thought I would say this, but following the discussion going on
about netfilter locking, I am starting to think that the RCU approach
might be simpler and more elegant than the nestable per-cpu rwlock
approaches proposed so far. ;)

Plus, there is much less name-calling involved in the making. :)

Cheers,

Mathieu

>
> > Mathieu
> >
> > > > > + /*
> > > > > + * We must flip twice to correctly handle tasks that stall
> > > > > + * in rcu_read_lock_fgp() between the time that they fetch
> > > > > + * rcu_fgp_ctr and the time that the store to their CPU's
> > > > > + * rcu_fgp_active_readers. No matter when they resume
> > > > > + * execution, we will wait for them to get to the corresponding
> > > > > + * rcu_read_unlock_fgp().
> > > > > + */
> > > > > + ACCESS_ONCE(rcu_fgp_ctr) ^= RCU_FGP_PARITY; /* flip parity 0 -> 1 */
> > > > > + rcu_fgp_wait_for_quiescent_state(); /* wait for old readers */
> > > > > + ACCESS_ONCE(rcu_fgp_ctr) ^= RCU_FGP_PARITY; /* flip parity 1 -> 0 */
> > > > > + rcu_fgp_wait_for_quiescent_state(); /* wait for old readers */
> > > > > +
> > > > > + /* Prevent CPUs from reordering out of prior RCU critical sections. */
> > > > > + smp_call_function(rcu_fgp_do_mb, NULL, 1);
> > > > > +
> > > >
> > > > Same here.
> > > >
> > > > So we would need to either add a smp_mb() at both of these locations, or
> > > > use on_each_cpu() rather than smp_call_function. Note that this is to
> > > > ensure that the "updater" thread executes these memory barriers.
> > >
> > > Or rely on the barriers that must be part of smp_call_function. ;-)
> > >
> > > Thanx, Paul
> > >
> > > > Mathieu
> > > >
> > > >
> > > > > + rcu_fgp_completed++;
> > > > > + mutex_unlock(&rcu_fgp_mutex);
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(synchronize_rcu_fgp);
> > > > > +
> > > > > +/**
> > > > > + * rcu_fgp_batches_completed - return batches completed.
> > > > > + * @sp: srcu_struct on which to report batch completion.
> > > > > + *
> > > > > + * Report the number of batches, correlated with, but not necessarily
> > > > > + * precisely the same as, the number of grace periods that have elapsed.
> > > > > + */
> > > > > +long rcu_fgp_batches_completed(void)
> > > > > +{
> > > > > + return rcu_fgp_completed;
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(rcu_fgp_batches_completed);
> > > >
> > > > --
> > > > Mathieu Desnoyers
> > > > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
> >

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68