Re: [PATCH 2/2] iommu/vt-d: Remove caching mode check before devtlb flush

From: Baolu Lu
Date: Mon Apr 08 2024 - 23:13:44 EST


On 4/9/24 5:03 AM, Jacob Pan wrote:
Hi Lu,

Hi Jacob,


On Sun, 7 Apr 2024 22:42:32 +0800, Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx>
wrote:

The Caching Mode (CM) of the Intel IOMMU indicates if the hardware
implementation caches not-present or erroneous translation-structure
entries except the first-stage translation. The caching mode is
unrelated to the device TLB , therefore there is no need to check
it before a device TLB invalidation operation.

Before the scalable mode is introduced, caching mode is treated as
an indication that the driver is running in a VM guest. This is just
a software contract as shadow page table is the only way to implement
a virtual IOMMU. But the VT-d spec doesn't state this anywhere. After
the scalable mode is introduced, this doesn't stand for anymore, as
caching mode is not relevant for the first-stage translation. A virtual
IOMMU implementation is free to support first-stage translation only
with caching mode cleared.

Remove the caching mode check before device TLB invalidation to ensure
compatibility with the scalable mode use cases.

I agree with the changes below, what about this CM check:

/* Notification for newly created mappings */
static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *domain,
unsigned long pfn, unsigned int pages)
{
/*
* It's a non-present to present mapping. Only flush if caching mode
* and second level.
*/
if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);

We are still tying devTLB flush to CM=1, no?

__mapping_notify_one() is called in the path where some PTEs are changed
from non-present to present.

In this scenario,

- if CM is set and first-stage translation is not used, the IOTLB caches
are required to be explicitly flushed.
- else if hardware requires write buffer flushing, do it.
- Otherwise, no op.
- devtlb invalidation is irrelevant to this path.

The code after the fix appears to do the right thing. devTLB is not
invalidated in iommu_flush_iotlb_psi() since it's a map (map == 1).

Or perhaps I overlooked anything?


If we are running in the guest with second level page table (shadowed), can
we decide if devTLB flush is needed based on ATS enable just as the rest of
the cases?

I think the ATS check should be consistent. It's generic no matter how
the IOMMU is implemented (in hardware or emulated in software).

Best regards,
baolu