Re: [PATCH 2/2] iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush

From: Niklas Schnelle
Date: Wed Sep 06 2023 - 04:37:05 EST


On Mon, 2023-09-04 at 17:33 +0100, Robin Murphy wrote:
> On 2023-09-04 16:34, Jean-Philippe Brucker wrote:
> > On Fri, Aug 25, 2023 at 05:21:26PM +0200, Niklas Schnelle wrote:
> > > Add ops->flush_iotlb_all operation to enable virtio-iommu for the
> > > dma-iommu deferred flush scheme. This results inn a significant increase
> >
> > in
> >
> > > in performance in exchange for a window in which devices can still
> > > access previously IOMMU mapped memory. To get back to the prior behavior
> > > iommu.strict=1 may be set on the kernel command line.
> >
> > Maybe add that it depends on CONFIG_IOMMU_DEFAULT_DMA_{LAZY,STRICT} as
> > well, because I've seen kernel configs that enable either.
>
> Indeed, I'd be inclined phrase it in terms of the driver now actually
> being able to honour lazy mode when requested (which happens to be the
> default on x86), rather than as if it might be some
> potentially-unexpected change in behaviour.
>
> Thanks,
> Robin.
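
Agreed, phrasing it as the driver now being able to honour lazy mode
when requested sounds good. Just to double check my understanding, the
effective decision then comes down to roughly the following (a
simplified sketch against linux/iommu.h, not the actual iommu core
code; the helper name and the cmdline parameters are made up for
illustration):

/*
 * Lazy (deferred) flush is used only if both the Kconfig default
 * respectively the iommu.strict= command line ask for it and the
 * driver advertises IOMMU_CAP_DEFERRED_FLUSH.
 */
static bool use_deferred_flush(struct device *dev, bool cmdline_set,
			       bool cmdline_strict)
{
	bool strict = IS_ENABLED(CONFIG_IOMMU_DEFAULT_DMA_STRICT);

	if (cmdline_set)	/* iommu.strict={0,1} */
		strict = cmdline_strict;

	return !strict && device_iommu_capable(dev, IOMMU_CAP_DEFERRED_FLUSH);
}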

I kept running this series on a KVM guest on my private workstation
(QEMU v8.0.4) while running iperf3 on a passed-through Intel 82599 VF
and got a bunch of IOMMU events similar to the following as well as
card resets in the host.

...
[ 5959.338214] vfio-pci 0000:04:10.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0037 address=0x7b657064 flags=0x0000]
[ 5963.353429] ixgbe 0000:03:00.0 enp3s0: Detected Tx Unit Hang
Tx Queue <0>
TDH, TDT <93>, <9d>
next_to_use <9d>
next_to_clean <93>
tx_buffer_info[next_to_clean]
time_stamp <10019e800>
jiffies <10019ec80>
...

I retested on vanilla v6.5 (guest & host) and still get the above
errors, so luckily for me the issue doesn't seem to be caused by the
new code, but I can't reproduce it without virtio-iommu. Any idea what
could cause this?


>
> > > Link: https://lore.kernel.org/lkml/20230802123612.GA6142@myrica/
> > > Signed-off-by: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>
> > > ---
> > > drivers/iommu/virtio-iommu.c | 12 ++++++++++++
> > > 1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> > > index fb73dec5b953..1b7526494490 100644
> > > --- a/drivers/iommu/virtio-iommu.c
> > > +++ b/drivers/iommu/virtio-iommu.c
> > > @@ -924,6 +924,15 @@ static int viommu_iotlb_sync_map(struct iommu_domain *domain,
> > > return viommu_sync_req(vdomain->viommu);
> > > }
> > >
> > > +static void viommu_flush_iotlb_all(struct iommu_domain *domain)
> > > +{
> > > + struct viommu_domain *vdomain = to_viommu_domain(domain);
> > > +
> > > + if (!vdomain->nr_endpoints)
> > > + return;
> >
> > As for patch 1, a NULL check in viommu_sync_req() would allow dropping
> > this one
> >
> > Thanks,
> > Jean

Right, makes sense. I will move the check into viommu_sync_req() and
add a comment that it is there for the cases where viommu_iotlb_sync()
et al. get called before the IOMMU is set up.
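
Roughly what I have in mind, i.e. something along these lines (an
untested sketch on top of the existing viommu_sync_req(); the comment
wording and exact placement of the check may still change in the next
version):

static int viommu_sync_req(struct viommu_dev *viommu)
{
	int ret;
	unsigned long flags;

	/*
	 * .iotlb_sync_map and .flush_iotlb_all may be called before the
	 * viommu is initialized, e.g. for direct mappings, in which case
	 * there is nothing to sync yet.
	 */
	if (!viommu)
		return 0;

	spin_lock_irqsave(&viommu->request_lock, flags);
	ret = __viommu_sync_req(viommu);
	if (ret)
		dev_dbg(viommu->dev, "could not sync requests (%d)\n", ret);
	spin_unlock_irqrestore(&viommu->request_lock, flags);

	return ret;
}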

> >
> > > + viommu_sync_req(vdomain->viommu);
> > > +}
> > > +
> > > static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
> > > {
> > > struct iommu_resv_region *entry, *new_entry, *msi = NULL;
> > > @@ -1049,6 +1058,8 @@ static bool viommu_capable(struct device *dev, enum iommu_cap cap)
> > > switch (cap) {
> > > case IOMMU_CAP_CACHE_COHERENCY:
> > > return true;
> > > + case IOMMU_CAP_DEFERRED_FLUSH:
> > > + return true;
> > > default:
> > > return false;
> > > }
> > > @@ -1069,6 +1080,7 @@ static struct iommu_ops viommu_ops = {
> > > .map_pages = viommu_map_pages,
> > > .unmap_pages = viommu_unmap_pages,
> > > .iova_to_phys = viommu_iova_to_phys,
> > > + .flush_iotlb_all = viommu_flush_iotlb_all,
> > > .iotlb_sync = viommu_iotlb_sync,
> > > .iotlb_sync_map = viommu_iotlb_sync_map,
> > > .free = viommu_domain_free,
> > >
> > > --
> > > 2.39.2
> > >