Re: [PATCH 1/4] spinlock: Document memory barrier rules

From: Manfred Spraul
Date: Wed Aug 31 2016 - 14:32:31 EST

Next message: Tero Kristo: "Re: [PATCH 3/3] clk: keystone: Add sci-clk driver support"
Previous message: Andrew Morton: "Re: [PATCH] Update my e-mail address"
In reply to: Will Deacon: "Re: [PATCH 1/4] spinlock: Document memory barrier rules"
Next in thread: Peter Zijlstra: "Re: [PATCH 0/4] Clarify/standardize memory barriers for lock/unlock"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 08/31/2016 06:40 PM, Will Deacon wrote:

> I'm struggling with this example. We have these locks:
>
> &sem->lock
> &sma->sem_base[0...sma->sem_nsems].lock
> &sma->sem_perm.lock
>
> a condition variable:
>
> sma->complex_mode
>
> and a new barrier:
>
> smp_mb__after_spin_lock()
>
> For simplicity, we can make sma->sem_nsems == 1, and have &sma->sem_base[0]
> be &sem->lock in the example above.

Correct.

> &sma->sem_perm.lock seems to be irrelevant.

Correct.

> The litmus test then looks a bit like:
>
> CPUm:
>
> LOCK(x)
> smp_mb();
> RyAcq=0
>
> CPUn:
>
> Wy=1
> smp_mb();
> UNLOCK_WAIT(x)

Correct.

> which I think can be simplified to:
>
> LOCK(x)

I thought that a barrier is required here, because Ry=0 could be performed
before the store to the lock.

> Ry=0

RyAcq instead of Ry would be required because of the unlock at the end of
the critical section:

CpuN: <...>
      WyRelease=0

For the litmus test this is irrelevant.

> Wy=1
> smp_mb(); // Note that this is implied by spin_unlock_wait on PPC and arm64
> LOCK(x) // spin_unlock_wait behaves like lock; unlock
> UNLOCK(x)
>
> [I've removed a bunch of barriers here, that I don't think are necessary
> for the guarantees you're after]
>
> and the question is "Can both CPUs proceed?".
>
> Looking at the above, then I don't think that they can. Whilst CPUm can
> indeed speculate the Ry=0 before successfully taking the lock, if CPUn
> observes CPUm's read, then it must also observe the lock being held wrt
> the spin_lock API. That is because a successful LOCK operation by CPUn
> would force CPUm to replay its LL/SC loop and therefore discard its
> speculation of y.
>
> What am I missing? The code snippet seems to have too many barriers to me!

spin_unlock_wait() is not necessarily lock()+unlock().
It can be a simple Rx, or now RxAcq.

So I had assumed:

CPUm:

LOCK(x)
smp_mb(); /* at least for PPC, therefore with arch override */
RyAcq=0

CPUn:

Wy=1
smp_mb(); /* at least for archs where UNLOCK_WAIT is RxAcq */
UNLOCK_WAIT(x)
smp_rmb(); /* not required anymore, was required when UNLOCK_WAIT was Rx */

--
Manfred
