RE: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding

From: Wu, Feng
Date: Thu Jun 25 2015 - 05:37:24 EST




> -----Original Message-----
> From: Joerg Roedel [mailto:joro@xxxxxxxxxx]
> Sent: Wednesday, June 24, 2015 11:46 PM
> To: Alex Williamson
> Cc: Wu, Feng; Eric Auger; Avi Kivity; kvm@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; pbonzini@xxxxxxxxxx; mtosatti@xxxxxxxxxx
> Subject: Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding
>
> On Thu, Jun 18, 2015 at 02:04:08PM -0600, Alex Williamson wrote:
> > There are plenty of details to be filled in,
>
> I also need to fill plenty of details in my head first, so here are some
> suggestions based on my current understanding. Please don't hesitate to
> correct me where I got something wrong.
>
> So first I totally agree that the handling of PI/non-PI configurations
> should be transparent to user-space.

After thinking about this a bit more, I recall why I used user-space to
trigger the IRTE update for posted interrupts. Here is the reason:

Let's take MSI as an example:
When the guest updates the MSI configuration, this is the code path in
QEMU and KVM:

vfio_update_msi() --> vfio_update_kvm_msi_virq() -->
kvm_irqchip_update_msi_route() --> kvm_update_routing_entry() -->
kvm_irqchip_commit_routes() --> KVM_SET_GSI_ROUTING -->
kvm_set_irq_routing()

It finally reaches kvm_set_irq_routing() in KVM, where there are two problems:
1. This function uses RCU, which makes it hard to find which entry in the irq
routing table is being updated.
2. Even if we find the updated entry, it is hard to find the assigned device
associated with this irq routing entry.
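
To illustrate problem 1, here is a rough user-space sketch, not kernel code.
It assumes the new routing table arrives as a whole and replaces the old one,
so the only way to learn which entry changed is to compare the two tables.
The struct msi_route type and the find_changed_routes() helper are
hypothetical names made up for this example; they are not KVM structures or
functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one MSI routing entry. */
struct msi_route {
    uint32_t gsi;
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

/* Return nonzero when both entries describe the same GSI and MSI message. */
int route_equal(const struct msi_route *a, const struct msi_route *b)
{
    return a->gsi == b->gsi &&
           a->address_lo == b->address_lo &&
           a->address_hi == b->address_hi &&
           a->data == b->data;
}

/* Linear search for the entry with the given GSI; NULL if absent. */
const struct msi_route *find_by_gsi(const struct msi_route *tbl,
                                    size_t n, uint32_t gsi)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].gsi == gsi)
            return &tbl[i];
    return NULL;
}

/*
 * Collect the GSIs whose entry is new, or differs from the entry with the
 * same GSI in the old table. Returns the number of GSIs written to
 * 'changed' (at most 'max').
 */
size_t find_changed_routes(const struct msi_route *old_tbl, size_t n_old,
                           const struct msi_route *new_tbl, size_t n_new,
                           uint32_t *changed, size_t max)
{
    size_t count = 0;

    for (size_t i = 0; i < n_new && count < max; i++) {
        const struct msi_route *prev =
            find_by_gsi(old_tbl, n_old, new_tbl[i].gsi);

        if (!prev || !route_equal(prev, &new_tbl[i]))
            changed[count++] = new_tbl[i].gsi;
    }
    return count;
}
```

Even with such a comparison, problem 2 remains: nothing in the entry itself
says which assigned device the GSI belongs to.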

So I used a VFIO API to notify KVM of the updated MSI/MSI-X configuration and
the associated assigned devices. I think we need to find a way to address
the above two issues before going forward. Alex, what is your opinion?
Thanks a lot!

Thanks,
Feng


>
> I read a bit through the VT-d spec, and my understanding of posted
> interrupts so far is that:
>
> 1) Each VCPU gets a PI-Descriptor with its pending Posted
> Interrupts. This descriptor needs to be updated when a VCPU
> is migrated to another PCPU and should thus be under control
> of KVM.
>
> This is similar to the vAPIC backing page in the AMD version
> of this, except that the PCPU routing information is stored
> somewhere else on AMD.
>
> 2) As long as the VCPU runs the IRTEs are configured for
> posting, when the VCPU goes to sleep the old remapped entry is
> established again. So when the VCPU sleeps the interrupt
> would get routed to VFIO and forwarded through the eventfd.
>
> This would be different to the AMD version, where we have a
> running bit. When this is clear the IOMMU will trigger an event
> in its event-log. This might need special handling in VFIO
> ('might' because VFIO does not need to forward the interrupt,
> it just needs to make sure the VCPU wakes up).
>
> Please correct me if my understanding of the Intel version is
> wrong.
>
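
For point 1) above, a very rough model of the per-VCPU descriptor might look
like the sketch below. It is only an illustration: the struct pi_desc_model
type, its field names and widths, and the helper functions are all
hypothetical and do not reproduce the layout defined in the VT-d spec. It
shows a pending-interrupt bitmap with one bit per vector, plus a notification
destination that has to be rewritten when the VCPU moves to another PCPU.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Simplified model of a posted-interrupt descriptor. Field names and
 * widths are illustrative only, not the hardware layout.
 */
struct pi_desc_model {
    uint32_t pir[8];  /* 256 pending-interrupt bits, one per vector */
    bool     on;      /* a notification is outstanding */
    uint32_t ndst;    /* notification destination: ID of the current PCPU */
};

/* Start with no pending interrupts, targeting the given PCPU. */
void pi_desc_init(struct pi_desc_model *pd, uint32_t pcpu_id)
{
    memset(pd, 0, sizeof(*pd));
    pd->ndst = pcpu_id;
}

/*
 * Record a pending vector. Real hardware and kernel code would need
 * atomic updates here; this model is single-threaded.
 */
void pi_desc_post(struct pi_desc_model *pd, uint8_t vector)
{
    pd->pir[vector / 32] |= UINT32_C(1) << (vector % 32);
    pd->on = true;
}

/* Check whether a vector is marked pending. */
bool pi_desc_pending(const struct pi_desc_model *pd, uint8_t vector)
{
    return (pd->pir[vector / 32] >> (vector % 32)) & 1u;
}

/* On VCPU migration only the destination changes; pending bits stay. */
void pi_desc_migrate(struct pi_desc_model *pd, uint32_t new_pcpu_id)
{
    pd->ndst = new_pcpu_id;
}
```

The point of the model is that the update on migration touches only state
that the VCPU scheduler side knows about, which fits keeping the descriptor
under KVM's control.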
> So most of the data structures the IOMMU reads for this need to be
> updated from KVM code (either x86-generic or AMD/Intel specific code),
> as KVM has the information about VCPU load/unload and the IRQ routing.
>
> What KVM needs from VFIO is the information about the physical
> interrupts, and it makes total sense to attach them as metadata to the
> eventfd.
>
> But the problems start with what this metadata should look like. It would
> be good to have some generic description, but I am not sure if this is
> possible. Otherwise this metadata would need to be requested by VFIO
> from the IOMMU driver and passed on to KVM, which it then passes back to
> the IOMMU driver. Or something like that.
>
>
>
> Joerg
