Op 13-11-2024 om 03:13 schreef Baolu Lu:
On 11/13/24 05:22, Kees Bakker wrote:
Op 13-10-2023 om 15:58 schreef Jingqi Liu:
Add a debugfs directory per pair of {device, pasid} if the mappings of
its page table are created and destroyed by the iommu_map/unmap()
interfaces. i.e. /sys/kernel/debug/iommu/intel/<device source id>/ <pasid>.
Create a debugfs file in the directory for users to dump the page
table corresponding to {device, pasid}. e.g.
/sys/kernel/debug/iommu/intel/0000:00:02.0/1/domain_translation_struct.
For the default domain without pasid, it creates a debugfs file in the
debugfs device directory for users to dump its page table. e.g.
/sys/kernel/debug/iommu/intel/0000:00:02.0/domain_translation_struct.
When setting a domain to a PASID of device, create a debugfs file in
the pasid debugfs directory for users to dump the page table of the
specified pasid. Remove the debugfs device directory of the device
when releasing a device. e.g.
/sys/kernel/debug/iommu/intel/0000:00:01.0
Signed-off-by: Jingqi Liu <Jingqi.liu@xxxxxxxxx>
---
drivers/iommu/intel/debugfs.c | 53 ++++++++++++++++++++++++++++++ +----
drivers/iommu/intel/iommu.c | 7 +++++
drivers/iommu/intel/iommu.h | 14 +++++++++
3 files changed, 69 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/ debugfs.c
[...]
+/* Remove the device pasid debugfs directory. */
+void intel_iommu_debugfs_remove_dev_pasid(struct dev_pasid_info *dev_pasid)
+{
+ debugfs_remove_recursive(dev_pasid->debugfs_dentry);
+}
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
[...]
@@ -4710,6 +4713,7 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
spin_unlock_irqrestore(&dmar_domain->lock, flags);
domain_detach_iommu(dmar_domain, iommu);
+ intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
kfree(dev_pasid);
out_tear_down:
intel_pasid_tear_down_entry(iommu, dev, pasid, false);
So, a call to intel_iommu_debugfs_remove_dev_pasid() was added.
There is a potential problem that dev_pasid can be NULL.
The diff doesn't show the whole context so let me give that here.
Today that piece of the code looks like this
list_for_each_entry(curr, &dmar_domain->dev_pasids, link_domain) {
if (curr->dev == dev && curr->pasid == pasid) {
list_del(&curr->link_domain);
dev_pasid = curr;
break;
}
}
WARN_ON_ONCE(!dev_pasid);
spin_unlock_irqrestore(&dmar_domain->lock, flags);
cache_tag_unassign_domain(dmar_domain, dev, pasid);
domain_detach_iommu(dmar_domain, iommu);
intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
kfree(dev_pasid);
The for_each loop can exit without finding an entry.
The WARN_ON_ONCE also suggests that there can be a NULL.
After that the new function is called (see above) and it will
dereference the NULL pointer.
Can you have a closer look?
It's already a kernel bug if dev_pasid is NULL when it reaches here. If
that happens, we should fix the bug, not avoid using the NULL pointer.
How about moving the WARN_ON_ONCE down a bit and use its return value?
Like so
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 527f6f89d8a1..204873976ef3 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4096,13 +4096,14 @@ void domain_remove_dev_pasid(struct iommu_domain *domain,
break;
}
}
- WARN_ON_ONCE(!dev_pasid);
spin_unlock_irqrestore(&dmar_domain->lock, flags);
cache_tag_unassign_domain(dmar_domain, dev, pasid);
domain_detach_iommu(dmar_domain, iommu);
- intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
- kfree(dev_pasid);
+ if (!WARN_ON_ONCE(!dev_pasid)) {
+ intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
+ kfree(dev_pasid);
+ }
}
static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
Would you accept a patch like that? Or do you want to change it yourself?