Re: [PATCH] locking/osq: Drop the overload of osq lock
From: Peter Zijlstra
Date: Sat Jun 25 2016 - 10:25:08 EST
On Sat, Jun 25, 2016 at 01:42:03PM -0400, Pan Xinhui wrote:
> An over-committed guest with more vCPUs than pCPUs has a heavy overload
> in osq_lock().
>
> This is because vCPU A holds the osq lock and yields out, while vCPU B
> waits for the per-CPU node->locked to be set. IOW, vCPU B waits for
> vCPU A to run and unlock the osq lock. Even the need_resched() check
> does not help in such a scenario.
>
> To fix this, add a threshold to one of the while-loops in osq_lock().
> The threshold value is roughly equal to SPIN_THRESHOLD.
Blergh, virt ...
So yes, lock holder preemption sucks. You would also want to limit the
immediate spin on owner.
Also; I really hate these random number spin-loop thresholds.
Is it at all possible to get feedback from your LPAR stuff that the vcpu
was preempted? Because at that point we can do something like:
	int vpc = vcpu_preempt_count();
	...
	for (;;) {
		/* the big spin loop */
		if (need_resched() || vpc != vcpu_preempt_count())
			/* bail */
	}
With a default implementation like:
	static inline int vcpu_preempt_count(void)
	{
		return 0;
	}
So the compiler can make it all go away.
But on virt muck it would stop spinning the moment the vcpu gets
preempted, which is the right moment I'm thinking.
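To make the idea concrete, here is a rough userspace sketch of that loop, not kernel code. Everything in it other than the vcpu_preempt_count()/need_resched() check is an assumption made for illustration: the sim_* variables stand in for hypervisor and scheduler state, and spin_wait() with its event injection is a hypothetical harness so the loop can be exercised single-threaded.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace stand-ins. In the kernel these would be fed by
 * the scheduler and by whatever preemption feedback the hypervisor offers.
 */
static int sim_preempt_count;	/* bumped each time the "vCPU" is preempted */
static bool sim_need_resched;

static inline int vcpu_preempt_count(void)
{
	return sim_preempt_count;
}

static inline bool need_resched(void)
{
	return sim_need_resched;
}

/* Events the harness can inject into one iteration of the spin. */
enum sim_event { SIM_UNLOCK, SIM_PREEMPT, SIM_RESCHED };

/*
 * Spin until *locked is set. Returns true if the lock was handed over,
 * false if we bailed. 'event' fires once, on iteration 'when'.
 */
static bool spin_wait(volatile int *locked, enum sim_event event, int when)
{
	int vpc = vcpu_preempt_count();	/* snapshot before spinning */
	int i;

	for (i = 0; !*locked; i++) {
		if (i == when) {
			if (event == SIM_UNLOCK)
				*locked = 1;
			else if (event == SIM_PREEMPT)
				sim_preempt_count++;
			else
				sim_need_resched = true;
		}

		/* Stop the moment we were preempted or should reschedule. */
		if (need_resched() || vpc != vcpu_preempt_count())
			return false;	/* bail */
	}

	return true;
}
```

Note that the snapshot is taken per call, so a preemption seen by an earlier spin does not make a later one bail. With the default vcpu_preempt_count() returning a constant 0, the second half of the condition is always false and only the need_resched() test remains.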