Re: [PATCH v2 2/5] locking/pvqspinlock: Fix missed PV wakeup problem

From: Waiman Long
Date: Fri Jul 15 2016 - 16:22:09 EST


On 07/15/2016 06:07 AM, Peter Zijlstra wrote:
On Fri, Jul 15, 2016 at 05:39:46PM +0800, Pan Xinhui wrote:
I'm thinking you're trying to say this:


CPU0 CPU1 CPU2

__pv_queued_spin_unlock_slowpath()
...
smp_store_release(&l->locked, 0);
__pv_queued_spin_lock_slowpath()
...
pv_queued_spin_steal_lock()
cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0


pv_wait_head_or_lock()

pv_kick(node->cpu); ----------------------> pv_wait(&l->locked, _Q_SLOW_VAL);

__pv_queued_spin_unlock()
cmpxchg(&l->locked, _Q_LOCKED_VAL, 0) == _Q_LOCKED_VAL

for () {
trylock_clear_pending();
cpu_relax();
}

pv_wait(&l->locked, _Q_SLOW_VAL);


Which is indeed 'bad', but not fatal, note that the later pv_wait() will
not in fact go wait, since l->locked will _not_ be _Q_SLOW_VAL.
the problem is that "this later pv_wait will do nothing as l->locked
is not _Q_SLOW_VAL", So it is not paravirt friendly then. we will go
into the trylock loop again and again until the lock is unlocked.
Agreed, which is 'bad'. But the patch spoke about a missing wakeup,
which is worse, as that would completely inhibit progress.

Sorry, it is my mistake. There is no missing pv_wait().

So if we are kicked by the unlock_slowpath, and the lock is stealed by
someone else, we need hash its node again and set l->locked to
_Q_SLOW_VAL, then enter pv_wait.
Right, let me go think about this a bit.

Yes, the purpose of this patch is to do exactly that. Let's the queue head vCPU sleeps until the lock holder release the lock and wake the queue head vCPU up.


but I am worried about lock stealing. could the node in the queue
starve for a long time? I notice the latency of pv_wait on an
over-commited guest can be bigger than 300us. I have not seen such
starving case, but I think it is possible to happen.
I share that worry, which is why we limit the steal attempt to one.
But yes, theoretically its possible to starve things AFAICT.

We've not come up with sensible way to completely avoid starvation.

If you guys are worrying about lock constantly getting stolen between pv_kick() of queue head vCPU and it is ready to take the lock, we can keep the pending bit set across pv_wait() if it is the 2nd or later time that pv_wait() is called. That will ensure that no lock stealing can happen and cap the maximum wait time to about 2x (spin + pv_wait). I will add that patch to my patch series.

Cheers,
Longman