Just been thinking about this, but I don't see an issue here becauseWith that in mind, it would probably be a good idea to eyeball the qspinlockIt seems plausible that the same thing could occur here in qspinlock:
slowpath as well, as that uses both atomic_cond_read_acquire() and
atomic_try_cmpxchg_relaxed().
if ((val & _Q_TAIL_MASK) == tail) {
if (atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL))
goto release; /* No contention */
}
everybody is queuing the same way (i.e. we don't have a mechanism to jump
the queue like we do for qrwlock) and the tail portion of the lock word
isn't susceptible to ABA. That is, once we're at the head of the queue
and we've seen the lock become unlocked via atomic_cond_read_acquire(),
then we know we hold it.
So qspinlock looks fine to me, but I'd obviously value anybody else's
opinion on that.