RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

From: Zhang, Tina
Date: Wed Apr 10 2024 - 19:49:24 EST


Hi,

> -----Original Message-----
> From: Tian, Kevin <kevin.tian@xxxxxxxxx>
> Sent: Tuesday, April 9, 2024 3:30 PM
> To: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>; iommu@xxxxxxxxxxxxxxx
> Cc: Liu, Yi L <yi.l.liu@xxxxxxxxx>; Joerg Roedel <joro@xxxxxxxxxx>;
> Will Deacon <will@xxxxxxxxxx>; Robin Murphy <robin.murphy@xxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before
> devtlb flush
>
> > From: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> > Sent: Sunday, April 7, 2024 10:43 PM
> >
> > The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
> > implementation caches not-present or erroneous translation-structure
> > entries, except for the first-stage translation. The caching mode is
> > unrelated to the device TLB; therefore, there is no need to check it
> > before a device TLB invalidation operation.
> >
> > Before the scalable mode was introduced, caching mode was treated as
> > an indication that the driver was running in a VM guest. This was
> > just a software contract, since a shadow page table was the only way
> > to implement a virtual IOMMU, but the VT-d spec doesn't state it
> > anywhere. After the introduction of the scalable mode, this assumption
> > no longer holds, as caching mode is not relevant for the first-stage
> > translation. A virtual IOMMU implementation is free to support
> > first-stage translation only, with caching mode cleared.
>
> I didn't get the connection to the scalable mode.
>
> If required, we can still use caching mode to imply running as a guest.
> We just need to make sure its implementation conforms to the VT-d spec.
>
> >
> > Remove the caching mode check before device TLB invalidation to ensure
> > compatibility with the scalable mode use cases.
> >
> > Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
> > Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> > ---
> > drivers/iommu/intel/iommu.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 493b6a600394..681789b1258d 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
> >  	else
> >  		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
> >
> > -	if (!cap_caching_mode(iommu->cap) && !map)
> > +	if (!map)
> >  		iommu_flush_dev_iotlb(domain, addr, mask);
>
> As commented earlier, better to squash this into patch 1.
>
> > }
> >
> > @@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
> >  			iommu->flush.flush_iotlb(iommu, did, 0, 0,
> >  						 DMA_TLB_DSI_FLUSH);
> >
> > -		if (!cap_caching_mode(iommu->cap))
> > -			iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
> > +		iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
> >  	}
> > }
> >
>
> I'm hesitant to agree with this change. Strictly speaking it's correct,
> but w/o batch invalidation support it implies a performance drop on
> viommu due to more VM-exits, and that may incur user complaints when
> VMs upgrade to a newer kernel version.
>
> So it'd be better to keep this behavior and fix it together with batch
> invalidation support. Anyway, none of the viommu implementations today
> (either shadow or nested translation) relies on correct devtlb behavior
> from the guest; otherwise it's already broken.
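
To recap the semantics under discussion: with CM=1 the IOMMU may cache
not-present entries, so a non-present to present change needs an IOTLB
flush on the map path, while the device TLB lives on the endpoint and is
never gated by CM. Roughly, paraphrasing the driver's map-notification
logic (an illustrative sketch, not verbatim driver code):

	/*
	 * CM=1: not-present entries may be cached, so flush the IOTLB
	 * even for a non-present -> present change. This only matters
	 * for the second-stage translation; CM says nothing about the
	 * device TLB on the endpoint.
	 */
	if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
		iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
	else
		iommu_flush_write_buffer(iommu);
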
Given that, how about we split this change out into a separate patch? I'm
working on the batch invalidation patch set now, and I'm happy to include
this code change in that series.
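
For background, batch invalidation here means coalescing multiple
queued-invalidation descriptors and submitting them in one shot, so a
CM=1 guest takes one VM-exit per batch rather than one per flushed
device. Very roughly (a sketch only: qi_submit_sync() and struct qi_desc
are the existing queued-invalidation interfaces, while the batch
structure, helper names, and batch size below are made up for
illustration):

	#define EXAMPLE_BATCH_SIZE	16

	struct example_qi_batch {
		struct qi_desc desc[EXAMPLE_BATCH_SIZE];
		unsigned int num;
	};

	/* Queue a descriptor; submit the whole batch when it fills up. */
	static void example_batch_add(struct intel_iommu *iommu,
				      struct example_qi_batch *batch,
				      const struct qi_desc *desc)
	{
		batch->desc[batch->num++] = *desc;
		if (batch->num == EXAMPLE_BATCH_SIZE) {
			qi_submit_sync(iommu, batch->desc, batch->num, 0);
			batch->num = 0;
		}
	}

	/* Push out whatever is left at the end of a flush operation. */
	static void example_batch_flush(struct intel_iommu *iommu,
					struct example_qi_batch *batch)
	{
		if (batch->num) {
			qi_submit_sync(iommu, batch->desc, batch->num, 0);
			batch->num = 0;
		}
	}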

Regards,
-Tina