Re: [PATCH 2/3] kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection

From: Michael S. Tsirkin
Date: Wed Sep 28 2016 - 09:46:23 EST


On Wed, Sep 28, 2016 at 10:21:41AM +0200, Paolo Bonzini wrote:
>
>
> On 28/09/2016 01:07, Michael S. Tsirkin wrote:
> > On Tue, Sep 27, 2016 at 11:20:12PM +0200, Paolo Bonzini wrote:
> >> Since bf9f6ac8d749 ("KVM: Update Posted-Interrupts Descriptor when vCPU
> >> is blocked", 2015-09-18) the posted interrupt descriptor is checked
> >> unconditionally for PIR.ON. Therefore we don't need KVM_REQ_EVENT to
> >> trigger the scan and, if NMIs or SMIs are not involved, we can avoid
> >> the complicated event injection path.
> >>
> >> However, there is a race between vmx_deliver_posted_interrupt and
> >> vcpu_enter_guest. Fix it by disabling interrupts before vcpu->mode is
> >> set to IN_GUEST_MODE.
> >
> > Could you describe the race a bit more please?
> > I'm surprised that locally disabling irqs
> > fixes a race with a different CPUs.
>
> The posted interrupt IPI has a dummy handler in arch/x86/kernel/irq.c
> (smp_kvm_posted_intr_ipi). It only does something if it is received
> while the guest is running.
>
> So local_irq_disable has an interesting side effect. Because the
> interrupt will not be processed until the guest is entered,
> local_irq_disable effectively switches the IRQ handler from the dummy
> handler to the processor's posted interrupt handling.
>
> So you want to do that before setting IN_GUEST_MODE, otherwise the IPI
> sent by deliver_posted_interrupt is ignored.
>
> However, the patch is wrong, because this bit:
>
> if (kvm_lapic_enabled(vcpu)) {
> /*
> * Update architecture specific hints for APIC
> * virtual interrupt delivery.
> */
> if (vcpu->arch.apicv_active)
> kvm_x86_ops->hwapic_irr_update(vcpu,
> kvm_lapic_find_highest_irr(vcpu));
> }
>
> also has to be moved after setting IN_GUEST_MODE. Basically the order
> for interrupt injection is:
>
> (1) set PIR
> smp_wmb()

Empty on x86 btw but good for reasoning about barrier
pairing. So - where's the paired smp_rmb? See below.

> (2) set ON
> smp_mb()

This one can be combined to smp_store_mb to save
a couple of cycles.

> (3) read vcpu->mode
> if IN_GUEST_MODE
> (4a) send posted interrupt IPI
> else
> (4b) kick (i.e. cmpxchg vcpu->mode from IN_GUEST_MODE to
> EXITING_GUEST_MODE and send reschedule IPI)
>
> while the order for entering the guest must be the opposite. The
> numbers on the left identify the pairing between interrupt injection and
> vcpu_entr_guest
>
> (4a) enable posted interrupt processing (i.e. disable interrupts!)
> (3) set vcpu->mode to IN_GUEST_MODE
> smp_mb()

This one can be combined to smp_store_mb to save
a couple of cycles.



> (2) read ON
> if ON then

do we need smp_rmb here?

> (1) read PIR
> sync PIR to IRR
> (4b) read vcpu->mode
> if vcpu->mode == EXITING_GUEST_MODE then
> cancel vmentry
> (3/2/1) # posted interrupts are processed on the next vmentry
>
> Paolo