RE: [patch 21/32] NTB/msi: Convert to msi_on_each_desc()
From: Tian, Kevin
Date: Thu Sep 15 2022 - 05:24:18 EST
Hi, Thomas,
> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Sent: Monday, December 13, 2021 4:56 AM
>
> Kevin,
>
> On Sun, Dec 12 2021 at 01:56, Kevin Tian wrote:
> >> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> >> All I can find is drivers/iommu/virtio-iommu.c but I can't find anything
> >> vIR related there.
> >
> > Well, virtio-iommu is a para-virtualized vIOMMU implementations.
> >
> > In reality there are also fully emulated vIOMMU implementations (e.g.
> > Qemu fully emulates Intel/AMD/ARM IOMMUs). In those configurations
> > the IR logic in existing iommu drivers just apply:
> >
> > drivers/iommu/intel/irq_remapping.c
> > drivers/iommu/amd/iommu.c
>
> thanks for the explanation. So that's a full IOMMU emulation. I was more
> expecting a paravirtualized lightweight one.
>
Resuming this old thread: after discussing with Reinette, who will pick up
this work, I realized this open was never closed.
In practice emulated IOMMUs are still widely used, and many new features
(pasid, sva, etc.) will come to them first rather than to virtio-iommu. So it'd
be good to make the new scheme work on emulated IOMMUs too.
Following this direction, one feasible option is probably to introduce a
PV facility in the spec of emulated IOMMUs, as a contract that differentiates
the new scheme from their existing vIR logic, if we don't want to go back to
heuristics.
Intel-IOMMU already defines a VCMD interface in the spec, in particular for
exchanging information between host/guest. It can be easily extended to
advertise and allow the exchange of interrupt addr/data pairs between
host/guest.
Does this sound like the right direction for other IOMMU vendors to follow if
they want to benefit from the new scheme instead of using virtio-iommu?
And with that we don't need to define CPU/VMM-specific hypercalls. We just
rely on vIOMMUs to claim the capability and enable the new vIR scheme.
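
To make the idea concrete, here is a rough sketch of the guest-side flow,
under the assumption that the vIOMMU exposes a capability bit and a
VCMD-like request for an addr/data pair. Every name below (VIOMMU_CAP_PV_IRQ,
struct viommu, viommu_get_msi_pair, mock_host_alloc) is hypothetical and not
taken from any IOMMU spec or from the kernel; the returned values are
arbitrary placeholders.

```c
#include <stdint.h>

/* Hypothetical capability bit; not defined by any existing IOMMU spec. */
#define VIOMMU_CAP_PV_IRQ (1u << 0)

/* Interrupt message handed from host to guest. */
struct pv_msi_pair {
	uint64_t addr;
	uint32_t data;
};

/* Guest view of a vIOMMU; alloc_irq stands in for a VCMD-like request. */
struct viommu {
	uint32_t caps;
	int (*alloc_irq)(struct viommu *v, uint32_t guest_vector,
			 struct pv_msi_pair *out);
};

enum vir_mode { VIR_LEGACY, VIR_PV };

/* No heuristics: the mode follows from what the vIOMMU claims. */
static enum vir_mode viommu_pick_vir_mode(const struct viommu *v)
{
	return (v->caps & VIOMMU_CAP_PV_IRQ) ? VIR_PV : VIR_LEGACY;
}

/* Ask the host for an addr/data pair; fails if the new scheme is absent. */
static int viommu_get_msi_pair(struct viommu *v, uint32_t guest_vector,
			       struct pv_msi_pair *out)
{
	if (viommu_pick_vir_mode(v) != VIR_PV || !v->alloc_irq)
		return -1;
	return v->alloc_irq(v, guest_vector, out);
}

/* Mock host side, for illustration only; the values are placeholders. */
static int mock_host_alloc(struct viommu *v, uint32_t guest_vector,
			   struct pv_msi_pair *out)
{
	(void)v;
	out->addr = 0xfee00000ull;
	out->data = 0x100u + guest_vector;
	return 0;
}
```

A vIOMMU that does not set the bit simply keeps the guest on its existing
vIR path, so no CPU/VMM-specific detection is involved.
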
--
btw, another open is about VM live migration.
After migration the IRTE index could change, hence the addr/data pair
acquired before migration becomes stale and must be fixed. The stale content
is both programmed into the interrupt storage and cached in msi_desc.
The host migration driver may be able to help fix the addr/data in MMIO-based
IMS when restoring the device state, but not in memory-based IMS or in
guest-cached content.
This kind of requires guest cooperation to get it right, e.g. notify the guest
to mark previously acquired addr/data as stale before stopping the VM on src,
and then notify it to reacquire addr/data after resuming the VM on dest.
But I'm not sure how to handle interrupts lost in the window between resuming
the VM and notifying it to reacquire...
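
A rough sketch of the two notifications, just to illustrate the handshake.
All names here (struct cached_msi, guest_on_pre_migrate,
guest_on_post_migrate, mock_dest_pair) are hypothetical, and the addr/data
values are arbitrary placeholders. It does not address the lost-interrupt
window; entries stay marked stale from the first call until the second one
runs, which is exactly that window.

```c
#include <stdbool.h>
#include <stdint.h>

struct msi_pair {
	uint64_t addr;
	uint32_t data;
};

/* Hypothetical guest-side cache entry, standing in for what msi_desc caches. */
struct cached_msi {
	struct msi_pair pair;
	bool stale;
};

/* Hypothetical hook that asks the destination host for a fresh pair. */
typedef struct msi_pair (*reacquire_fn)(unsigned int index);

/* Notification before the VM is stopped on src: mark everything stale. */
static void guest_on_pre_migrate(struct cached_msi *c, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++)
		c[i].stale = true;
}

/*
 * Notification after the VM resumes on dest: reacquire each stale pair.
 * A real guest would also reprogram the interrupt storage here.
 * Returns the number of entries refreshed.
 */
static unsigned int guest_on_post_migrate(struct cached_msi *c, unsigned int n,
					  reacquire_fn reacquire)
{
	unsigned int refreshed = 0;

	for (unsigned int i = 0; i < n; i++) {
		if (!c[i].stale)
			continue;
		c[i].pair = reacquire(i);
		c[i].stale = false;
		refreshed++;
	}
	return refreshed;
}

/* Mock destination host, for illustration only; values are placeholders. */
static struct msi_pair mock_dest_pair(unsigned int index)
{
	struct msi_pair p = { 0xfee00000ull, 0x100u + index };
	return p;
}
```
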
Thanks
Kevin