Re: [PATCH v3 3/3] vfio_pci: make use of update_irq_devid and optimize irq ops

From: luoben
Date: Thu Aug 22 2019 - 09:46:26 EST

å 2019/8/20 äå11:27, Liu, Jiang åé:

On Aug 20, 2019, at 11:24 PM, luoben <luoben@xxxxxxxxxxxxxxxxx> wrote:

å 2019/8/16 äå12:45, Thomas Gleixner åé:
On Thu, 15 Aug 2019, Ben Luo wrote:

if (vdev->ctx[vector].trigger) {
- free_irq(irq, vdev->ctx[vector].trigger);
- irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
- kfree(vdev->ctx[vector].name);
- eventfd_ctx_put(vdev->ctx[vector].trigger);
- vdev->ctx[vector].trigger = NULL;
+ if (vdev->ctx[vector].trigger == trigger) {
+ irq_bypass_unregister_producer(&vdev->ctx[vector].producer);

What's the point of unregistering the producer, just to register it again below?
Whether producer token changed or not, irq_bypass_register_producer() is a way (seems the

only way) to update IRTE by VFIO for posted interrupt.

When posted interrupt is in use, the target IRTE will be used by IOMMU directly to find the

target CPU for an interrupt posted to VM, since hypervisor is bypassed.

irq_bypass_register_producer() will modify IRTE based on the information retrieved from KVM,

0xffffffff8150a920 : modify_irte+0x0/0x180 [kernel]
0xffffffff8150ab94 : intel_ir_set_vcpu_affinity+0xf4/0x150 [kernel]
0xffffffff81125f3c : irq_set_vcpu_affinity+0x5c/0xa0 [kernel]
0xffffffffa0550818 : vmx_update_pi_irte+0x178/0x290 [kvm_intel] // get pi_desc etc. from KVM
0xffffffffa052b789 : kvm_arch_irq_bypass_add_producer+0x39/0x50 [kvm_intel] (inexact)
0xffffffffa024a50b : __connect+0x7b/0xa0 [kvm] (inexact)
0xffffffffa024a6a8 : irq_bypass_register_producer+0x108/0x140 [kvm] (inexact)
0xffffffffa0338386 : vfio_msi_set_vector_signal+0x1b6/0x2c0 [vfio_pci] (inexact)
+ } else if (trigger) {
+ ret = update_irq_devid(irq,
+ vdev->ctx[vector].trigger, trigger);
+ if (unlikely(ret)) {
+ dev_info(&pdev->dev,
+ "update_irq_devid %d (token %p) fails: %d\n",


+ irq, vdev->ctx[vector].trigger, ret);
+ eventfd_ctx_put(trigger);
+ return ret;
+ }

This lacks any form of comment why this is correct. dev_id is updated and
the producer with the old token is still registered.

ok, I will add comment like below:

For KVM-VFIO case, once producer and consumer connected successfully, interrupt from passthrough

device will be directly delivered to VM instead of triggering interrupt process in HOST. If producer and

consumer are disconnected, this interrupt process will fall back to remap mode, the handler vfio_msihandler()

registered in corresponding irqaction will use the dev_id to find the right way to deliver the interrupt to VM.

So, it is safe to update dev_id before unregistration of irq-bypass producer, i.e. switch back from posted

mode to remap mode, since only in remap mode the 'dev_id' will be used by interrupt handler. To producer

and consumer, dev_id is only a token for pairing them togather.
+ irq_bypass_unregister_producer(&vdev->ctx[vector].producer);

Now it's unregistered.

+ eventfd_ctx_put(vdev->ctx[vector].trigger);
+ vdev->ctx[vector].producer.token = trigger;

The token is updated and then it's newly registered below.

+ vdev->ctx[vector].trigger = trigger;
- vdev->ctx[vector].producer.token = trigger;
- vdev->ctx[vector].producer.irq = irq;
+ /* always update irte for posted mode */
ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
if (unlikely(ret))
"irq bypass producer (token %p) registration fails: %d\n",
vdev->ctx[vector].producer.token, ret);

I know this code already existed, but again this looks inconsistent. If the
registration fails then a subsequent update will try to unregister a not
registered producer. Does not make any sense to me.

irq_bypass_register_producer() also fails on duplicated registration, so maybe an unconditional try to

unregistration is a easy way to keep consistent.

Maybe it's better to change these function names to irq_bypass_try_register_producer() and

irq_bypass_try_unregister_producer() :)

Hi Ben,
The point is that we shouldnât blindly try to register again if we fails to unregister a posted interrupt producer. By this way, the fast path (posted interrupt) may get disabled, but itâs safer than blindly ignoring the failure of ungistration.

This may need another patchset to enhance the irq_bypass register/unregister functions first, at least to return some error code when irq_bypass_try_unregister_producer() fails. I would like to have a try later.