2014-05-07 11:01-0400, Waiman Long:
From: Peter Zijlstra<peterz@xxxxxxxxxxxxx>I think there is an unwanted scenario on virtual machines:
Because the qspinlock needs to touch a second cacheline; add a pending
bit and allow a single in-word spinner before we punt to the second
cacheline.
1) VCPU sets the pending bit and start spinning.
2) Pending VCPU gets descheduled.
- we have PLE and lock holder isn't running [1]
- the hypervisor randomly preempts us
3) Lock holder unlocks while pending VCPU is waiting in queue.
4) Subsequent lockers will see free lock with set pending bit and will
loop in trylock's 'for (;;)'
- the worst-case is lock starving [2]
- PLE can save us from wasting whole timeslice
Retry threshold is the easiest solution, regardless of its ugliness [4].
Another minor design flaw is that formerly first VCPU gets appended to
the tail when it decides to queue;
is the performance gain worth it?
Thanks.