Re: [PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
From: Peter Zijlstra
Date: Wed Nov 16 2016 - 06:43:50 EST
On Wed, Nov 16, 2016 at 12:29:44PM +0100, Christian Borntraeger wrote:
> On 11/16/2016 11:23 AM, Peter Zijlstra wrote:
> > On Wed, Nov 16, 2016 at 12:19:09PM +0800, Pan Xinhui wrote:
> >> Hi, Peter.
> >> I think we can avoid a function call in a simpler way. How about below
> >>
> >> static inline bool vcpu_is_preempted(int cpu)
> >> {
> >> /* only set in pv case*/
> >> if (pv_lock_ops.vcpu_is_preempted)
> >> return pv_lock_ops.vcpu_is_preempted(cpu);
> >> return false;
> >> }
> >
> > That is still more expensive. It needs to do an actual load and makes it
> > hard to predict the branch, you'd have to actually wait for the load to
> > complete etc.
>
> Out of curiosity, why is that hard to predict?
> On s390 the branch prediction runs asynchronously ahead of the downstream
> pipeline (e.g. search for "IBM z Systems Processor Optimization Primer" page 11).
> given enough capacity, I would assume that modern x86 processors would do the same
> and be able to predict this is as soon as it becomes hot (and otherwise you would
> not notice the branch miss anyway). Is x86 behaving differently here?
Not sure how exactly it works, but it seems to me that an immediate
assignment to the value you're going to compare would leave very little
doubt.
Then again, maybe cores aren't that smart and only look at the
hysterical btb for prediction.
> > Also, it generates more code.
> >
> > Paravirt muck should strive to be as cheap as possible when ran on
> > native hardware.
>
> As I am interested in this series from the s390 point of view, this is
> the only thing that block this series?
Ingo was rewriting the changelog, other than that, no, I can do this on
top. Just spotted this because Ingo and me talked it over.
> Is there a chance to add a static key around the paravirt ops somehow?
More code generation still, replacing the call with an immediate
assignment to the return register is the shortest possible option I
think.