Re: [PATCH 1/4] spinlock: Document memory barrier rules

From: Manfred Spraul
Date: Thu Sep 01 2016 - 07:04:36 EST

Next message: Colin King: "[PATCH] [media] rc/streamzap: fix spelling mistake "sumbiting" -> "submitting""
Previous message: Baoyou Xie: "[PATCH] virtio: mark vring_dma_dev() static"
In reply to: Peter Zijlstra: "Re: [PATCH 1/4] spinlock: Document memory barrier rules"
Next in thread: Will Deacon: "Re: [PATCH 1/4] spinlock: Document memory barrier rules"

Hi,

On 09/01/2016 10:44 AM, Peter Zijlstra wrote:

On Wed, Aug 31, 2016 at 08:32:18PM +0200, Manfred Spraul wrote:

On 08/31/2016 06:40 PM, Will Deacon wrote:

The litmus test then looks a bit like:

CPUm:

LOCK(x)
smp_mb();
RyAcq=0

CPUn:

Wy=1
smp_mb();
UNLOCK_WAIT(x)

Correct.

which I think can be simplified to:

LOCK(x)

I thought that a barrier is required here, because Ry=0 could be performed
before the store of the lock.

Ry=0

RyAcq instead of Ry would be required due to the unlock at the end of the
critical section:
CpuN: <...>
WyRelease=0
but that is irrelevant for the litmus test.

Wy=1
smp_mb(); // Note that this is implied by spin_unlock_wait on PPC and arm64
LOCK(x) // spin_unlock_wait behaves like lock; unlock
UNLOCK(x)
[I've removed a bunch of barriers here, that I don't think are necessary
for the guarantees you're after]

and the question is "Can both CPUs proceed?".

Looking at the above, then I don't think that they can. Whilst CPUm can
indeed speculate the Ry=0 before successfully taking the lock, if CPUn
observes CPUm's read, then it must also observe the lock being held wrt
the spin_lock API. That is because a successful LOCK operation by CPUn
would force CPUm to replay its LL/SC loop and therefore discard its
speculation of y.

What am I missing? The code snippet seems to have too many barriers to me!
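
For readers who want to experiment with the first form of the litmus test quoted above (the one with smp_mb() on both CPUs), here is a rough user-space sketch in C11. It is an approximation under stated assumptions, not kernel code: LOCK(x) is reduced to a plain store of the lock word, UNLOCK_WAIT(x) to a single acquire load, and smp_mb() to a sequentially consistent fence. The names x, y, cpu_m, cpu_n and run_litmus are made up for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: x models the lock word, y the flag. */
static atomic_int x;
static atomic_int y;

static bool m_read_y0;      /* CPUm observed y == 0 */
static bool n_saw_unlocked; /* CPUn observed the lock as free */

static void *cpu_m(void *unused)
{
    (void)unused;
    /* LOCK(x), reduced to its store (assumption: only the write matters). */
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);          /* smp_mb() */
    /* RyAcq */
    m_read_y0 = atomic_load_explicit(&y, memory_order_acquire) == 0;
    return NULL;
}

static void *cpu_n(void *unused)
{
    (void)unused;
    atomic_store_explicit(&y, 1, memory_order_relaxed); /* Wy=1 */
    atomic_thread_fence(memory_order_seq_cst);          /* smp_mb() */
    /* UNLOCK_WAIT(x), reduced to one RxAcq that checks for "free". */
    n_saw_unlocked = atomic_load_explicit(&x, memory_order_acquire) == 0;
    return NULL;
}

/* Runs the test `iterations` times and returns how often both CPUs
 * "proceeded", i.e. CPUm read y == 0 and CPUn saw the lock free. */
int run_litmus(int iterations)
{
    int both = 0;

    for (int i = 0; i < iterations; i++) {
        pthread_t m, n;
        int rc;

        atomic_store(&x, 0);
        atomic_store(&y, 0);

        rc = pthread_create(&m, NULL, cpu_m, NULL);
        assert(rc == 0);
        rc = pthread_create(&n, NULL, cpu_n, NULL);
        assert(rc == 0);
        (void)rc;

        /* Joining makes the plain bool results safe to read here. */
        pthread_join(m, NULL);
        pthread_join(n, NULL);

        if (m_read_y0 && n_saw_unlocked)
            both++;
    }
    return both;
}
```

With a full fence between the store and the load on both sides, the C11 memory model forbids the outcome where both loads return 0, so run_litmus() should return 0 for any iteration count. Dropping either fence turns this into the classic store-buffering pattern, where the outcome becomes possible. The sketch says nothing about the LL/SC replay argument above, which depends on how the architecture implements the lock.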

spin_unlock_wait() is not necessarily lock()+unlock().
It can be a simple Rx, or now RxAcq.

Can be, normally, yes. But on power and arm64, the only architectures on
which the ACQUIRE is 'funny', they do the 'pointless' ll/sc cycle in
spin_unlock_wait() to 'fix' things.

So for both power and arm64, you can in fact model spin_unlock_wait()
as LOCK+UNLOCK.

Is this consensus?

If I understand it right, the rules are:
1. spin_unlock_wait() must behave like spin_lock();spin_unlock();
2. spin_is_locked() must behave like spin_trylock() ? spin_unlock(),TRUE : FALSE
3. the ACQUIRE during spin_lock applies to the lock load, not to the store.
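
To make the three rules concrete, here is a minimal sketch on a toy C11 lock. Everything named toy_* is hypothetical and only models the rules; it is not the kernel's spinlock_t or any architecture's implementation. Note one assumption: toy_spin_is_locked() returns true when the lock is held, following the function's name, since rule 2 is about ordering behaviour rather than the return value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock, not the kernel's spinlock_t. */
struct toy_spinlock {
    atomic_int locked; /* 0 = free, 1 = held */
};

bool toy_spin_trylock(struct toy_spinlock *l)
{
    int expected = 0;

    /* Rule 3: the ACQUIRE ordering of this read-modify-write attaches to
     * the load of the lock word; the store of 1 gets no ordering of its
     * own against later accesses. */
    return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

void toy_spin_lock(struct toy_spinlock *l)
{
    while (!toy_spin_trylock(l))
        ; /* spin until the lock word changes from 0 to 1 */
}

void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Rule 1: modelled literally as lock followed by unlock. */
void toy_spin_unlock_wait(struct toy_spinlock *l)
{
    toy_spin_lock(l);
    toy_spin_unlock(l);
}

/* Rule 2: modelled as a trylock that is undone again on success. */
bool toy_spin_is_locked(struct toy_spinlock *l)
{
    if (toy_spin_trylock(l)) {
        toy_spin_unlock(l);
        return false; /* the lock was free */
    }
    return true; /* somebody else holds the lock */
}
```

A real spin_unlock_wait() would avoid writing to the lock word where the architecture allows it. The model only pins down which ordering guarantees callers may rely on.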

sem.c and nf_conntrack.c need only rule 1 now, but I would document the rest as well, ok?

I'll update the patches.

--
Manfred
