Correct.
I'm struggling with this example. We have these locks:
&sem->lock
&sma->sem_base[0...sma->sem_nsems].lock
&sma->sem_perm.lock
a condition variable:
sma->complex_mode
and a new barrier:
smp_mb__after_spin_lock()
For simplicity, we can make sma->sem_nsems == 1, and have &sma->sem_base[0]
be &sem->lock in the example above.
Correct.
&sma->sem_perm.lock seems to be irrelevant.
Correct.
The litmus test then looks a bit like:
CPUm:
LOCK(x)
smp_mb();
RyAcq=0
CPUn:
Wy=1
smp_mb();
UNLOCK_WAIT(x)
I thought that a barrier is required here, because Ry=0 could be performed before the store of the lock.
which I think can be simplified to:
LOCK(x)
Ry=0
RyAcq instead of Ry would be required due to the unlock at the end of the critical section.
Wy=1
smp_mb(); // Note that this is implied by spin_unlock_wait on PPC and arm64
LOCK(x) // spin_unlock_wait behaves like lock; unlock
UNLOCK(x)
spin_unlock_wait() is not necessarily lock()+unlock().
[I've removed a bunch of barriers here, that I don't think are necessary
for the guarantees you're after]
and the question is "Can both CPUs proceed?".
Looking at the above, then I don't think that they can. Whilst CPUm can
indeed speculate the Ry=0 before successfully taking the lock, if CPUn
observes CPUm's read, then it must also observe the lock being held wrt
the spin_lock API. That is because a successful LOCK operation by CPUn
would force CPUm to replay its LL/SC loop and therefore discard its
speculation of y.
What am I missing? The code snippet seems to have too many barriers to me!
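
For reference, the first form of the litmus test (the one with smp_mb() on both CPUs) can be approximated in userspace. This is only a rough sketch under stated assumptions: C11 atomics and seq_cst fences stand in for the kernel primitives, UNLOCK_WAIT(x) is reduced to a single observation of the lock word, and the names cpu_m, cpu_n and litmus_both_proceed are hypothetical. It says nothing about the simplified form without the barrier on CPUm, since the LL/SC replay argument is not something C11 models.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x;  /* lock word: 0 = free, 1 = held */
static atomic_int y;  /* stands in for the condition variable */
static int r_y;       /* value of y read by CPUm */
static int r_x;       /* lock state observed by CPUn */

static void *cpu_m(void *arg)
{
    (void)arg;
    /* LOCK(x): the lock starts free, so a single exchange takes it. */
    atomic_exchange_explicit(&x, 1, memory_order_acquire);
    atomic_thread_fence(memory_order_seq_cst);             /* smp_mb() */
    r_y = atomic_load_explicit(&y, memory_order_acquire);  /* RyAcq */
    return NULL;
}

static void *cpu_n(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);    /* Wy=1 */
    atomic_thread_fence(memory_order_seq_cst);             /* smp_mb() */
    /* UNLOCK_WAIT(x), reduced to one look at the lock word
     * (assumption: no spinning, CPUm never unlocks in this test). */
    r_x = atomic_load_explicit(&x, memory_order_acquire);
    return NULL;
}

/* Runs the test once. Returns 1 if both CPUs "proceeded", i.e. CPUm
 * read y == 0 and CPUn saw the lock as free; returns 0 otherwise. */
int litmus_both_proceed(void)
{
    pthread_t tm, tn;
    int rc;

    atomic_store(&x, 0);
    atomic_store(&y, 0);

    rc = pthread_create(&tm, NULL, cpu_m, NULL);
    assert(rc == 0);
    rc = pthread_create(&tn, NULL, cpu_n, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(tm, NULL);
    pthread_join(tn, NULL);

    return r_y == 0 && r_x == 0;
}
```

With a seq_cst fence on each side this is the classic store-buffering shape, so C11 forbids the outcome in which both loads return 0; calling litmus_both_proceed() in a loop should therefore never return 1. Build with something like cc -std=c11 -pthread.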