[RFC][PATCH 0/3] locking/qspinlock: Improve determinism for x86
From: Peter Zijlstra
Date: Wed Sep 26 2018 - 07:30:03 EST
Back when Will did his qspinlock determinism patches, we were left with one
cmpxchg loop on x86 due to the use of atomic_fetch_or(). Will proposed a nifty
trick:
http://lkml.kernel.org/r/20180409145409.GA9661@xxxxxxx
But at the time we didn't pursue it. This series implements it and argues for
its correctness. In particular, it places an smp_mb__after_atomic() between
the two operations (the xchg() store and the subsequent load), which forces
the load to come after the store (and is free on x86 anyway).
This ordering ensures that a concurrent unlock cannot trigger the
uncontended handoff. It also ensures that if the xchg() happens after a
(successful) trylock, we must observe the LOCKED bit.
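
For illustration only, here is a rough userspace sketch of the ordering described above. It is not the code from the patches. The helper name model_fetch_set_pending(), the union qsl_model and the Q_* constants are hypothetical. The sketch assumes a little-endian lock word with the locked byte at offset 0 and the pending byte at offset 1, and it relies on the GCC/Clang __atomic builtins tolerating mixed-size accesses to the same word, which plain C11 does not promise. A seq_cst fence stands in for smp_mb__after_atomic().

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>

/* Hypothetical model constants; little-endian layout assumed. */
#define Q_LOCKED_VAL   (1U << 0)
#define Q_PENDING_VAL  (1U << 8)
#define Q_PENDING_MASK (0xffU << 8)

/* Hypothetical model of the 32-bit lock word with byte-wise access. */
union qsl_model {
	uint32_t val;
	struct {
		uint8_t  locked;
		uint8_t  pending;
		uint16_t tail;
	};
};

static_assert(sizeof(union qsl_model) == sizeof(uint32_t),
	      "lock word must be 32 bits");

/*
 * Set the pending byte and return a snapshot of the whole word in which
 * the pending bit reflects the value seen *before* the store.
 */
static inline uint32_t model_fetch_set_pending(union qsl_model *lock)
{
	uint32_t val;

	/* Store side: unconditional xchg on the pending byte, no cmpxchg loop. */
	val = (uint32_t)__atomic_exchange_n(&lock->pending, 1, __ATOMIC_RELAXED)
		* Q_PENDING_VAL;

	/* Stand-in for smp_mb__after_atomic(): the load must follow the store. */
	__atomic_thread_fence(__ATOMIC_SEQ_CST);

	/* Load side: pick up the locked byte and tail from the full word. */
	val |= __atomic_load_n(&lock->val, __ATOMIC_ACQUIRE) & ~Q_PENDING_MASK;

	return val;
}
```

With the fence in place, a caller that runs after a successful trylock sees Q_LOCKED_VAL in the returned snapshot, which is the second property argued for above.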