Re: [PATCH] rwsem: fix missed wakeup due to reordering of load
From: Peter Zijlstra
Date: Thu Aug 10 2017 - 06:41:30 EST
On Thu, Aug 10, 2017 at 10:32:56AM +0200, Andrea Parri wrote:
> On Thu, Jul 27, 2017 at 11:48:53AM -0400, Waiman Long wrote:
> > On 07/26/2017 04:17 PM, Prateek Sood wrote:
> > > If a spinner is present, there is a chance that the load of
> > > rwsem_has_spinner() in rwsem_wake() can be reordered with
> > > respect to decrement of rwsem count in __up_write() leading
> > > to wakeup being missed.
> > >
> > > spinning writer up_write caller
> > > --------------- -----------------------
> > > [S] osq_unlock() [L] osq
> > > spin_lock(wait_lock)
> > > sem->count=0xFFFFFFFF00000001
> > > +0xFFFFFFFF00000000
> > > count=sem->count
> > > MB
> > > sem->count=0xFFFFFFFE00000001
> > > -0xFFFFFFFF00000001
> > > spin_trylock(wait_lock)
> > > return
> > > rwsem_try_write_lock(count)
> > > spin_unlock(wait_lock)
> > > schedule()
> > >
> > > Reordering of atomic_long_sub_return_release() in __up_write()
> > > and rwsem_has_spinner() in rwsem_wake() can cause missing of
> > > wakeup in up_write() context. In spinning writer, sem->count
> > > and local variable count is 0XFFFFFFFE00000001. It would result
> > > in rwsem_try_write_lock() failing to acquire rwsem and spinning
> > > writer going to sleep in rwsem_down_write_failed().
> > >
> > > The smp_rmb() will make sure that the spinner state is
> > > consulted after sem->count is updated in up_write context.
> > >
> > > Signed-off-by: Prateek Sood <prsood@xxxxxxxxxxxxxx>
> >
> > Did you actually observe that the reordering happens?
> >
> > I am not sure if some architectures can actually speculatively execute
> > instruction ahead of a branch and then ahead into a function call. I
> > know it can happen if the function call is inlined, but rwsem_wake()
> > will not be inlined into __up_read() or __up_write().
>
> Branches/control dependencies targeting a read do not necessarily preserve
> program order; this is for example the case for PowerPC and ARM.
>
> I'd not expect more than a compiler barrier from a function call (in fact,
> not even that if the function happens to be inlined).
Indeed. Reads can be speculated by deep out-of-order CPUs no problem.
That's what branch predictors are for.
> > Even if that is the case, I am not sure if smp_rmb() alone is enough to
> > guarantee the ordering as I think it will depend on how the
> > atomic_long_sub_return_release() is implmented.
>
> AFAICT, the pattern under discussion is MP with:
>
> - a store-release to osq->tail(unlock) followed by a store to sem->count,
> separated by a MB (from atomic_long_add_return()) on CPU0;
>
> - a load of sem->count (for atomic_long_sub_return_release()) followed by
Which is a regular load, as 'release' only need apply to the store.
> a load of osq->tail (rwsem_has_spinner()) on CPU1.
>
> Thus a RMW between the two loads suffices to forbid the weak behaviour.
Agreed.