Re: [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state
From: Paul E. McKenney
Date: Tue Nov 15 2016 - 09:36:17 EST
On Tue, Nov 15, 2016 at 09:16:55AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 14, 2016 at 10:12:37AM -0800, Paul E. McKenney wrote:
> > On Mon, Nov 14, 2016 at 06:37:33PM +0100, Peter Zijlstra wrote:
> > > On Mon, Nov 14, 2016 at 09:25:12AM -0800, Josh Triplett wrote:
> > > > On Mon, Nov 14, 2016 at 08:57:12AM -0800, Paul E. McKenney wrote:
> > > > > Expedited grace periods check dyntick-idle state, and avoid sending
> > > > > IPIs to idle CPUs, including those running guest OSes, and, on NOHZ_FULL
> > > > > kernels, nohz_full CPUs. However, the kernel has been observed checking
> > > > > a CPU while it was non-idle, but sending the IPI after it has gone
> > > > > idle. This commit therefore rechecks idle state immediately before
> > > > > sending the IPI, refraining from IPIing CPUs that have since gone idle.
> > > > >
> > > > > Reported-by: Rik van Riel <riel@xxxxxxxxxx>
> > > > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > > >
> > > > atomic_add_return(0, ...) seems odd. Do you actually want that, rather
> > > > than atomic_read(...)? If so, can you please document exactly why?
> > >
> > > Yes that is weird. The only effective difference is that it would do a
> > > load-exclusive instead of a regular load.
> >
> > It is weird, and checking to see if it is safe to convert it and its
> > friends to something with less overhead is on my list. This starts
> > with a patch series I will post soon that consolidates all these
> > atomic_add_return() calls into a single function, which will ease testing
> > and other verification.
> >
> > All that aside, please keep in mind that much is required from this load.
> > It is part of a network of ordered operations that guarantee that any
> > operation from any CPU preceding a given grace period is seen to precede
> > any other operation from any CPU following that same grace period.
> > And each and every CPU must agree on the order of those two operations,
> > otherwise, RCU is broken.
>
> OK, so something similar to:
>
> smp_mb();
> atomic_read();
>
> then? That would order, with global transitivity, against prior
> operations.

Maybe.  The consolidation in the later patch series is a first step
towards potential weakening.

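For anyone following along, the two forms under discussion can be
sketched in user-space C11 roughly as below.  This is only an analogy:
C11 seq_cst ordering is not the same thing as the kernel's fully ordered
atomic_add_return() or smp_mb(), and the names demo_dynticks, snap_rmw(),
snap_fence_load(), and check_snapshots_agree() are made up for
illustration, not taken from the kernel.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a per-CPU dyntick-idle counter. */
static atomic_int demo_dynticks = 3;

/*
 * Read via a read-modify-write that adds zero.  Rough user-space
 * analogue of atomic_add_return(0, ...), using the strongest C11 ordering.
 */
static int snap_rmw(atomic_int *v)
{
	/* fetch_add returns the old value, which equals the new one here. */
	return atomic_fetch_add_explicit(v, 0, memory_order_seq_cst);
}

/* Full fence, then a relaxed load: rough analogue of smp_mb(); atomic_read(). */
static int snap_fence_load(atomic_int *v)
{
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(v, memory_order_relaxed);
}

/* Single-threaded sanity check: both forms observe the same value. */
static void check_snapshots_agree(void)
{
	assert(snap_rmw(&demo_dynticks) == snap_fence_load(&demo_dynticks));
}
```

Both helpers return the same value when run single-threaded; whatever
difference there is lies only in the ordering each one provides against
surrounding accesses on other CPUs, which a single-threaded check cannot
show.
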
> > In addition, please note also that these operations are nowhere near
> > any fastpaths.
>
> My concern is mostly that it reads very weird. I appreciate this not
> being fast path code, but confusing code is bad in any form.

It is the long-standing code that has been checking dyntick-idle counters
for quite some time.  This patch just applies that same code to a new
use case within the expedited grace periods, as you can see by looking
a bit earlier in that same function.

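As a rough user-space illustration of the recheck described in the commit
log (snapshot the counter at the first check, then look again immediately
before sending the IPI), something like the following C11 sketch captures
the idea.  It assumes that an even counter value means "idle" and that
any change in the counter since the snapshot means the CPU has passed
through idle.  The helper names are hypothetical and this is not the
kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Sketch assumption: an even counter value means the CPU is idle. */
static bool snap_is_idle(int snap)
{
	return (snap & 0x1) == 0;
}

/* Strongly ordered read of the counter via an add-zero read-modify-write. */
static int take_snapshot(atomic_int *dynticks)
{
	return atomic_fetch_add_explicit(dynticks, 0, memory_order_seq_cst);
}

/*
 * Recheck immediately before sending the IPI.  Skip the IPI if the CPU
 * was idle at snapshot time or if its counter has changed since then.
 */
static bool ipi_still_needed(atomic_int *dynticks, int snap)
{
	if (snap_is_idle(snap))
		return false;
	return take_snapshot(dynticks) == snap;
}
```

A caller would take the snapshot during the initial scan and call
ipi_still_needed() just before the point where the IPI would be sent.
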
Thanx, Paul