Re: [PATCH v2 11/16] iommu/vt-d: preserve PASID table of preserved device

From: Samiullah Khawaja

Date: Mon May 11 2026 - 14:45:47 EST


On Fri, May 08, 2026 at 02:05:58PM +0800, Baolu Lu wrote:
> On 4/28/26 01:56, Samiullah Khawaja wrote:
>> In scalable mode the PASID table is used to fetch the io page tables.
>> Preserve and restore the PASID table of the preserved devices.
>>
>> Signed-off-by: Samiullah Khawaja <skhawaja@xxxxxxxxxx>
>> ---
>>  drivers/iommu/intel/iommu.c      |   5 +-
>>  drivers/iommu/intel/iommu.h      |  12 +++
>>  drivers/iommu/intel/liveupdate.c | 141 +++++++++++++++++++++++++++++++
>>  drivers/iommu/intel/pasid.c      |   7 +-
>>  drivers/iommu/intel/pasid.h      |   9 ++
>>  include/linux/kho/abi/iommu.h    |  13 +++
>>  6 files changed, 184 insertions(+), 3 deletions(-)
>>
>
> [snip]
>> +
>> +void pasid_cleanup_preserved_table(struct device *dev)
>> +{
>> +	struct pasid_table *pasid_table;
>> +	struct pasid_dir_entry *dir;
>> +	struct pasid_entry *table;
>> +	size_t dir_size;
>> +
>> +	pasid_table = intel_pasid_get_table(dev);
>> +	if (!pasid_table)
>> +		return;
>> +
>> +	dir = pasid_table->table;
>> +	table = get_pasid_table_from_pde(&dir[0]);
>> +	if (!table)
>> +		return;
>> +
>> +	/* Clear everything except the first entry in table. */
>> +	memset(&table[1], 0, SZ_4K - sizeof(*table));
>> +
>> +	/* Use the folio order to calculate the size of Pasid Directory */
>> +	dir_size = (1 << (folio_order(virt_to_folio(dir)) + PAGE_SHIFT));
>> +
>> +	/* Clear everything except the first entry in directory */
>> +	memset(&dir[1], 0, dir_size - sizeof(struct pasid_dir_entry));
>> +
>> +	clflush_cache_range(&table[0], SZ_4K);
>> +	clflush_cache_range(&dir[0], dir_size);
>> +}

> The PASID table is currently active and in use by the hardware. Clearing
> the entries without the necessary hardware cache invalidation is buggy.
>
> It seems this manual clearing is a workaround because PASID domain
> preservation isn't supported yet. If so, rather than clearing the table
> blindly, the code should verify if any PASIDs (other than
> IOMMU_NO_PASID) are actually in use. If there are, the preserve callback
> should return an error.

Thanks for looking into this, I agree and will remove the clearing logic
here.

Yes, we do this check in iommufd, as it is the DMA owner of the device,
and only DMA-owned devices are allowed to be preserved.

During preservation, iommufd returns an error if the device has any
PASID attachments other than IOMMU_NO_PASID. Once the device is
preserved, no PASID attachments are allowed until it is unpreserved.

I think I will make this check more robust by moving it into the core
and using pasid_array. It will require some plumbing, as pasid_array
lives in iommu.c.
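
To make the intended behaviour concrete, here is a rough userspace C model
of the check. It is only a sketch, not the actual patch: the model_* names
are hypothetical, a plain bool array stands in for pasid_array, and
returning -EBUSY is an assumption about the error code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NO_PASID  0u	/* mirrors IOMMU_NO_PASID, which is 0 */
#define MODEL_MAX_PASID 8u	/* arbitrary small bound for the model */

/* Hypothetical stand-in for the per-group state kept in iommu.c. */
struct model_group {
	bool attached[MODEL_MAX_PASID];	/* attached[p]: domain attached at PASID p */
	bool preserved;			/* set while the device is preserved */
};

/* True if any PASID other than MODEL_NO_PASID has an attachment. */
static bool model_has_pasid_attachments(const struct model_group *g)
{
	for (unsigned int p = 0; p < MODEL_MAX_PASID; p++)
		if (p != MODEL_NO_PASID && g->attached[p])
			return true;
	return false;
}

/* Preserve fails instead of clearing PASID entries behind the hardware. */
static int model_preserve(struct model_group *g)
{
	if (model_has_pasid_attachments(g))
		return -EBUSY;	/* assumed error code */
	g->preserved = true;
	return 0;
}

static void model_unpreserve(struct model_group *g)
{
	g->preserved = false;
}

/* New PASID attachments are refused while the device is preserved. */
static int model_attach_pasid(struct model_group *g, unsigned int pasid)
{
	if (pasid >= MODEL_MAX_PASID)
		return -EINVAL;
	if (pasid != MODEL_NO_PASID && g->preserved)
		return -EBUSY;	/* assumed error code */
	g->attached[pasid] = true;
	return 0;
}
```

With both checks in one place, the preserved PASID table never needs to be
cleared by hand, because it can only ever hold the IOMMU_NO_PASID entry.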

> Or anything I overlooked here?
>
> Thanks,
> baolu

Thanks,
Sami