Re: [PATCH tip/core/rcu 1/9] rcu: Provide GP ordering in face of migrations and delays

From: Paul E. McKenney
Date: Thu Oct 05 2017 - 14:22:21 EST

On Thu, Oct 05, 2017 at 06:25:14PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 05, 2017 at 09:19:09AM -0700, Paul E. McKenney wrote:
> >
> > Yes, the ordering does need to be visible to uninvolved CPUs, so
> > release-acquire is not necessarily strong enough.
> Why? Because I'm not at all seeing why it needs more. Your Changelog
> doesn't provide enough hints.

Hmmm... Here is what I was worried about:

C C-PaulEMcKenney-W+RWC4+2017-10-05

{
}

P0(int *a, int *x)
{
	WRITE_ONCE(*a, 1);
	smp_mb(); /* Lock acquisition for rcu_node ->lock. */
	WRITE_ONCE(*x, 1);
}

P1(int *x, int *y, spinlock_t *l)
{
	int r3;

	r3 = READ_ONCE(*x);
	smp_mb(); /* Lock acquisition for rcu_node ->lock. */
	spin_lock(l); /* Locking in complete(). */
	WRITE_ONCE(*y, 1);
	spin_unlock(l);
}

P2(int *y, int *b, spinlock_t *l)
{
	int r4;
	int r1;

	spin_lock(l); /* Locking in wait_for_completion(). */
	r4 = READ_ONCE(*y);
	spin_unlock(l);
	r1 = READ_ONCE(*b);
}

P3(int *b, int *a)
{
	int r2;

	WRITE_ONCE(*b, 1);
	smp_mb();
	r2 = READ_ONCE(*a);
}

exists (1:r3=1 /\ 2:r4=1 /\ 2:r1=0 /\ 3:r2=0)

My concern was that P2()'s read from b could be pulled into the lock's
critical section and reordered with the read from y. But even if
you physically rearrange the code that way, the cycle is forbidden.
Removing P2()'s lock acquisition and release does allow the cycle,
and I get the same result when replacing spin_lock() with xchg_acquire()
and spin_unlock() with smp_store_release(). I can drop P1()'s smp_mb()
(but otherwise keep the original above) and the cycle is still
forbidden. Removing P1()'s lock acquisition and release, but leaving
the smp_mb(), does allow the cycle.
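
For concreteness, the acquire/release substitution in P2() would look
something like the following sketch (the change of l to a plain int and
the scratch register r0 for xchg_acquire()'s return value are assumptions
about how the substitution was written, not the exact test run):

	P2(int *y, int *b, int *l)
	{
		int r0;
		int r4;
		int r1;

		r0 = xchg_acquire(l, 1); /* Was spin_lock(l). */
		r4 = READ_ONCE(*y);
		smp_store_release(l, 0); /* Was spin_unlock(l). */
		r1 = READ_ONCE(*b);
	}

With this substitution, nothing orders the smp_store_release() against
the later read from b, which is one way to see why release-acquire is
not necessarily strong enough here.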

So it looks like I can drop this patch entirely. Though somewhat
nervously, despite the evidence that I have ample redundancy in
ordering. ;-)

So, thank you for persisting on this one!

Thanx, Paul