Re: FW: Commit 81a43adae3b9 (locking/mutex: Use acquire/release semantics) causing failures on arm64 (ThunderX)

From: Will Deacon
Date: Fri Dec 11 2015 - 07:18:00 EST


On Fri, Dec 11, 2015 at 01:13:19PM +0100, Peter Zijlstra wrote:
> On Fri, Dec 11, 2015 at 12:04:19PM +0000, Will Deacon wrote:
> > I think Andrew meant the atomic_xchg_acquire at the start of osq_lock,
> > as opposed to "compare and swap". In which case, it does look like
> > there's a bug here because there is nothing to order the initialisation
> > of the node fields with publishing of the node, whether that's
> > indirectly as a result of setting the tail to the current CPU or
> > directly as a result of the WRITE_ONCE.
>
> Agreed, this does indeed look like a bug. If confirmed please write a
> shiny changelog and I'll queue asap.

Yup. I've failed to reproduce the issue locally, so we'll need to wait
for Andrew and/or David to get back to us first.

Will

> > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > index d092a0c9c2d4..05a37857ab55 100644
> > --- a/kernel/locking/osq_lock.c
> > +++ b/kernel/locking/osq_lock.c
> > @@ -93,10 +93,12 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> >  	node->cpu = curr;
> >
> >  	/*
> > -	 * ACQUIRE semantics, pairs with corresponding RELEASE
> > -	 * in unlock() uncontended, or fastpath.
> > +	 * We need both ACQUIRE (pairs with corresponding RELEASE in
> > +	 * unlock() uncontended, or fastpath) and RELEASE (to publish
> > +	 * the node fields we just initialised) semantics when updating
> > +	 * the lock tail.
> >  	 */
> > -	old = atomic_xchg_acquire(&lock->tail, curr);
> > +	old = atomic_xchg(&lock->tail, curr);
> >  	if (old == OSQ_UNLOCKED_VAL)
> >  		return true;
> >
>
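
For anyone following along outside the kernel tree, the ordering problem
in the hunk above can be sketched in userspace with C11 atomics. This is
only an illustration under stated assumptions, not the kernel code: the
names spin_node, queue_tail, queue_enter and queue_peek_tail are
hypothetical, the "cpu + 1" tail encoding is an assumption of the sketch,
and memory_order_acq_rel is a stand-in for the kernel's fully ordered
atomic_xchg(), which is stronger. The point is that the exchange which
publishes the node must carry RELEASE semantics, otherwise another CPU
that observes the new tail may still see stale node fields.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_UNLOCKED 0   /* hypothetical stand-in for OSQ_UNLOCKED_VAL */
#define MAX_CPUS 4         /* arbitrary size for this sketch */

/* Hypothetical per-CPU queue node, loosely modelled on the fields
 * initialised before the tail update in the patch above. */
struct spin_node {
	int locked;
	struct spin_node *next;
	int cpu;               /* encoded as cpu number + 1 (assumption) */
};

static struct spin_node nodes[MAX_CPUS];
static atomic_int queue_tail = QUEUE_UNLOCKED;

/*
 * Initialise this CPU's node, then publish it by swapping it into the
 * tail. Returns true if the queue was empty (uncontended case).
 * *prev_out receives the previous tail value.
 */
static bool queue_enter(int cpu, int *prev_out)
{
	assert(cpu >= 0 && cpu < MAX_CPUS);

	struct spin_node *node = &nodes[cpu];
	int curr = cpu + 1;

	/* Plain stores: nothing orders these by themselves. */
	node->locked = 0;
	node->next = NULL;
	node->cpu = curr;

	/*
	 * RELEASE orders the stores above before the tail update;
	 * ACQUIRE pairs with the unlocker's RELEASE. An acquire-only
	 * exchange here would allow the node initialisation to become
	 * visible after the new tail value on a weakly ordered machine.
	 */
	int old = atomic_exchange_explicit(&queue_tail, curr,
					   memory_order_acq_rel);
	*prev_out = old;
	return old == QUEUE_UNLOCKED;
}

/*
 * Observer side: an ACQUIRE load of the tail pairs with the RELEASE
 * half of the exchange, so the node fields read afterwards are the
 * initialised ones. Returns NULL if the queue is unlocked.
 */
static struct spin_node *queue_peek_tail(void)
{
	int tail = atomic_load_explicit(&queue_tail, memory_order_acquire);

	if (tail == QUEUE_UNLOCKED)
		return NULL;
	return &nodes[tail - 1];
}
```

Calling queue_enter(0, &prev) on an empty queue returns true and leaves
prev equal to QUEUE_UNLOCKED; a later queue_enter(2, &prev) returns false
with prev holding the encoded value of CPU 0. Run single-threaded, the
sketch only shows the shape of the publish/observe pairing; it will not
reproduce the reordering itself, which needs concurrent CPUs on a weakly
ordered machine.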