Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

From: Baolu Lu
Date: Thu Apr 11 2024 - 08:15:53 EST


On 2024/4/11 7:49, Zhang, Tina wrote:
Hi,

-----Original Message-----
From: Tian, Kevin <kevin.tian@xxxxxxxxx>
Sent: Tuesday, April 9, 2024 3:30 PM
To: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>; iommu@xxxxxxxxxxxxxxx
Cc: Liu, Yi L <yi.l.liu@xxxxxxxxx>; Joerg Roedel <joro@xxxxxxxxxx>; Will
Deacon <will@xxxxxxxxxx>; Robin Murphy <robin.murphy@xxxxxxx>; linux-
kernel@xxxxxxxxxxxxxxx
Subject: RE: [PATCH 2/2] iommu/vt-d: Remove caching mode check before
devtlb flush

From: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
Sent: Sunday, April 7, 2024 10:43 PM

The Caching Mode (CM) of the Intel IOMMU indicates whether the hardware
implementation caches not-present or erroneous translation-structure
entries, except for first-stage translation. Caching mode is unrelated
to the device TLB, so there is no need to check it before a device TLB
invalidation operation.

Before scalable mode was introduced, caching mode was treated as an
indication that the driver is running in a VM guest. That was only a
software convention, since a shadow page table was then the only way to
implement a virtual IOMMU; the VT-d spec doesn't state it anywhere.
After the introduction of scalable mode, the assumption no longer
holds, because caching mode is not relevant to first-stage translation.
A virtual IOMMU implementation is free to support only first-stage
translation and leave caching mode cleared.
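
For reference, CM is bit 7 of the VT-d Capability Register; it says
nothing about the device TLB (ATS), which caches only valid
translations. The driver's cap_caching_mode() helper simply extracts
that bit, matching the definition in the driver headers:

/*
 * Caching Mode: when set, hardware may cache not-present or erroneous
 * translation-structure entries (this does not apply to first-stage
 * translation), so IOTLB invalidation is required even when installing
 * a previously not-present mapping. The bit implies nothing about the
 * device TLB.
 */
#define cap_caching_mode(c)	(((c) >> 7) & 1)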

I didn't get the connection to scalable mode.

If required, we can still use caching mode to imply running as a guest.
We just need to make sure its implementation conforms to the VT-d spec.


Remove the caching mode check before device TLB invalidation to ensure
compatibility with scalable mode use cases.

Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
---
drivers/iommu/intel/iommu.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 493b6a600394..681789b1258d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1501,7 +1501,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
 	else
 		__iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
 
-	if (!cap_caching_mode(iommu->cap) && !map)
+	if (!map)
 		iommu_flush_dev_iotlb(domain, addr, mask);

As commented earlier, it'd be better to squash this into patch 1.

}

@@ -1575,8 +1575,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
 			iommu->flush.flush_iotlb(iommu, did, 0, 0,
 						 DMA_TLB_DSI_FLUSH);
 
-		if (!cap_caching_mode(iommu->cap))
-			iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
+		iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
 	}


I'm hesitant to agree with this change. Strictly speaking it's
correct, but without batch invalidation support it implies a
performance drop on viommu due to more VM-exits, and that may draw
complaints from users when their VMs upgrade to a newer kernel version.

So it'd be better to keep the current behavior and fix it together
with batch invalidation support. Anyway, none of the viommu
implementations today (either shadow or nested translation) relies on
correct devtlb behavior from the guest; otherwise it would already be
broken.
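
For illustration only, the kind of batching meant here could look
roughly like the sketch below: build both descriptors up front and
post them with a single qi_submit_sync() call, so a trapping viommu
sees one queue-tail write (one VM-exit) instead of two. The function
name is hypothetical and the descriptor encoding is simplified from
the driver's QI_* macros; this is a sketch, not the actual series
under development:

/*
 * Hypothetical sketch: submit an IOTLB flush and a device-TLB flush
 * as one batch. qi_submit_sync() accepts multiple descriptors, so
 * both are posted with a single invalidation-queue tail update.
 */
static void example_batched_flush(struct intel_iommu *iommu, u16 did,
				  u16 sid, u16 pfsid, u16 qdep)
{
	struct qi_desc desc[2] = {};

	/* Descriptor 0: domain-selective IOTLB invalidation. */
	desc[0].qw0 = QI_IOTLB_DID(did) |
		      QI_IOTLB_GRAN(DMA_TLB_DSI_FLUSH) |
		      QI_IOTLB_TYPE;

	/* Descriptor 1: device-TLB invalidation (encoding simplified). */
	desc[1].qw0 = QI_DEV_IOTLB_SID(sid) | QI_DEV_IOTLB_QDEP(qdep) |
		      QI_DEV_IOTLB_PFSID(pfsid) | QI_DIOTLB_TYPE;
	desc[1].qw1 = QI_DEV_IOTLB_ADDR(0) | QI_DEV_IOTLB_SIZE;

	/* One tail update posts both descriptors. */
	qi_submit_sync(iommu, desc, 2, 0);
}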
How about we split this change out into a separate patch? I'm working
on the batch invalidation patch set now and would be happy to include
this code change in that series.

No worries. It turned out that these two checks are simply
unnecessary, and removing them won't cause any change in driver
behavior. So I will make it a cleanup patch and target it for the next
merge window.

Best regards,
baolu