Re: [PATCH RFC] rcu: Make rcu_barrier() less disruptive

From: Paul E. McKenney
Date: Thu Mar 15 2012 - 14:40:13 EST


On Thu, Mar 15, 2012 at 11:21:59AM -0700, Paul E. McKenney wrote:
> On Thu, Mar 15, 2012 at 01:45:27PM -0400, Mathieu Desnoyers wrote:
> > * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > > The rcu_barrier() primitive interrupts each and every CPU, registering
> > > a callback on every CPU. Once all of these callbacks have been invoked,
> > > rcu_barrier() knows that every callback that was registered before
> > > the call to rcu_barrier() has also been invoked.
> > >
> > > However, there is no point in registering a callback on a CPU that
> > > currently has no callbacks, most especially if that CPU is in a
> > > deep idle state. This commit therefore makes rcu_barrier() avoid
> > > interrupting CPUs that have no callbacks. Doing this requires reworking
> > > the handling of orphaned callbacks, otherwise callbacks could slip through
> > > rcu_barrier()'s net by being orphaned from a CPU that rcu_barrier() had
> > > not yet interrupted to a CPU that rcu_barrier() had already interrupted.
> > > This reworking was needed anyway to take a first step towards weaning
> > > RCU from the CPU_DYING notifier's use of stop_cpu().
> >
> > Quoting Documentation/RCU/rcubarrier.txt:
> >
> > "We instead need the rcu_barrier() primitive. This primitive is similar
> > to synchronize_rcu(), but instead of waiting solely for a grace
> > period to elapse, it also waits for all outstanding RCU callbacks to
> > complete. Pseudo-code using rcu_barrier() is as follows:"
> >
> > The patch you propose seems like a good approach to make rcu_barrier
> > less disruptive, but everyone need to be aware that rcu_barrier() would
> > quit having the side-effect of doing the equivalent of
> > "synchronize_rcu()" from now on: within this new approach, in the case
> > where there are no pending callbacks, rcu_barrier() could, AFAIU, return
> > without waiting for the current grace period to complete.
> >
> > Any use of rcu_barrier() that would assume that a synchronize_rcu() is
> > implicit with the rcu_barrier() execution would be a bug anyway, but
> > those might only show up after this patch is applied. I would therefore
> > recommend to audit all rcu_barrier() users to ensure none is expecting
> > rcu_barrier to act as a synchronize_rcu before pushing this change.
>
> Good catch!
>
> I am going to chicken out and explicitly wait for a grace period if there
> were no callbacks. Having rcu_barrier() very rarely be a quick no-op does
> sound like a standing invitation for subtle non-reproducible bugs. ;-)

I take it back...

After adopting callbacks (rcu_adopt_orphan_cbs()), _rcu_barrier()
unconditionally posts a callback on the current CPU and waits for it.
So _rcu_barrier() actually does always wait for a grace period.

Yes, I could be more dainty and make rcu_adopt_orphan_cbs() return an
indication of whether there were any callbacks, and then post the callback
only if either there were some callbacks adopted or if there were no calls
to smp_call_function_single(). But that adds complexity for almost no
benefit -- and no one can accuse _rcu_barrier() of being a fastpath! ;-)

Or am I missing something here?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/