On Tue, May 10, 2022 at 11:47 AM Waiman Long <longman@xxxxxxxxxx> wrote:
> Qspinlock still has one head waiter spinning on the lock. This is much
> better than the original ticket spinlock where there will be n waiters
> spinning on the lock.

Oh, absolutely. I'm not saying we should look at going back. I'm more
asking whether maybe we could go even further..
> That is the cost of a cheap unlock. There is no way to eliminate all
> lock spinning unless we use the MCS lock directly, which will require
> a change in the locking API as well as a more expensive unlock.

So there's no question that unlock would be more expensive for the
contention case, since it would have to not only clear the lock
itself, but also update the node it points to.

But does it actually require a change in the locking API?
> At the minimum, we need to do a read to determine if the lock is
> contended and also a write to clear the lock bit. An atomic write
> after read is expensive, and there is also a possibility of a race
> with a non-atomic read-write. Moreover, the tail stored in the lock
> can only tell you the queue tail, not its current head, so there is
> no easy way to notify the current head waiter to acquire the lock.
> The only thing I can think of is to do some kind of proportional
> backoff for the head waiter, slowly increasing the time gap between
> successive spinning attempts on the lock, at the cost of an
> increased lock acquisition latency.
The qspinlock slowpath already always allocates that mcs node (for
some definition of "always" - I am obviously ignoring all the trylock
cases both before and in the slowpath).
But yes, clearly the simple store-release of the current
queued_spin_unlock() wouldn't work as-is, and maybe the cost of
replacing it with something else is much more expensive than any
possible win.
I think the PV case already basically does that - replacing the
"store release" with a much more complex sequence. No?