Re: [PATCH 2/8] iommu/vt-d: Add entry_sync support for PASID entry updates

From: Baolu Lu

Date: Thu Mar 12 2026 - 03:53:29 EST

On 3/9/26 21:41, Jason Gunthorpe wrote:

+static void intel_pasid_sync(struct entry_sync_writer128 *writer)
+{
+ struct intel_pasid_writer *p_writer = container_of(writer,
+ struct intel_pasid_writer, writer);
+ struct intel_iommu *iommu = p_writer->iommu;
+ struct device *dev = p_writer->dev;
+ bool was_present, is_present;
+ u32 pasid = p_writer->pasid;
+ struct pasid_entry *pte;
+ u16 old_did, old_pgtt;
+
+ pte = intel_pasid_get_entry(dev, pasid);
+ was_present = p_writer->was_present;
+ is_present = pasid_pte_is_present(pte);
+ old_did = pasid_get_domain_id(&p_writer->orig_pte);
+ old_pgtt = pasid_pte_get_pgtt(&p_writer->orig_pte);
+
+ /* Update the last present state: */
+ p_writer->was_present = is_present;
+
+ if (!ecap_coherent(iommu->ecap))
+ clflush_cache_range(pte, sizeof(*pte));
+
+ /* Sync for "P=0" to "P=1": */
+ if (!was_present) {
+ if (is_present)
+ pasid_flush_caches(iommu, pte, pasid,
+ pasid_get_domain_id(pte));
+
+ return;
+ }
+
+ /* Sync for "P=1" to "P=1": */
+ if (is_present) {
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ return;
+ }
+
+ /* Sync for "P=1" to "P=0": */
+ pasid_cache_invalidation_with_pasid(iommu, old_did, pasid);

Why all this logic? All this different stuff does is meddle with the
IOTLB and it should not seen below.

If the sync is called it should just always call
pasid_cache_invalidation_with_pasid(), that's it.

Writer has already eliminated all cases where sync isn't needed.

You're right. The library should simplify things. I will remove the
state tracking. The callback will only ensure that memory is flushed
(for non-coherent mode) and the relevant PASID cache is invalidated.

+ if (old_pgtt == PASID_ENTRY_PGTT_PT || old_pgtt == PASID_ENTRY_PGTT_FL_ONLY)
+ qi_flush_piotlb(iommu, old_did, pasid, 0, -1, 0);
+ else
+ iommu->flush.flush_iotlb(iommu, old_did, 0, 0, DMA_TLB_DSI_FLUSH);
+ devtlb_invalidation_with_pasid(iommu, dev, pasid);

The IOTLB should already be clean'd before the new entry using the
cache tag is programmed. Cleaning it after the entry is live is buggy.

> > The writer logic ensures it never sees a corrupted entry, so the clean

cache tag cannot be mangled during the writing process.

The way ARM is structured has the cache tags clean if they are in the
allocator bitmap, so when the driver fetches a new tag and starts
using it is clean and non cleaning is needed

When it frees a tag it cleans it and then returns it to the allocator.

If I understand your remark correctly, the driver should only need the
following in the sync callback:

- clflush (if non-coherent) to ensure the entry is in physical memory.
- PASID cache invalidation to force the hardware to re-read the entry.
- Device-TLB invalidation to drop local device caches.

Does that sound right? I can move the general IOTLB/PIOTLB invalidation
logic to the domain detach/free paths.

ATC invalidations should always be done after the PASID entry is
written. During a hitless update both translations are unpredictably
combined, this is unavoidable and OK.

The VT-d spec (Sections 6.5.2.5 and 6.5.2.6) explicitly mandates that an
IOTLB invalidation must precede the Device-TLB invalidation. If we only
do the device-TLB invalidation in the sync callback, we risk the device
re-fetching a stale translation from the IOMMU's internal IOTLB.

Thanks,
baolu