Re: [PATCH RFC v2 04/11] iommu: Add attach/detach_dev_pasid domain ops

From: Lu Baolu
Date: Mon Apr 04 2022 - 02:47:24 EST


Hi Jason,

On 2022/3/31 3:08, Jason Gunthorpe wrote:
On Tue, Mar 29, 2022 at 01:37:53PM +0800, Lu Baolu wrote:
Attaching an IOMMU domain to a PASID of a device is a generic operation
for modern IOMMU drivers which support PASID-granular DMA address
translation. Currently visible usage scenarios include (but are not limited to):

- SVA (Shared Virtual Address)
- kernel DMA with PASID
- hardware-assisted mediated device

This adds a pair of common domain ops for this purpose and adds some
common helpers to attach/detach a domain to/from a {device, PASID} and
get/put the domain attached to {device, PASID}.

Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
include/linux/iommu.h | 36 ++++++++++++++++++
drivers/iommu/iommu.c | 88 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 124 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 29c4c2edd706..a46285488a57 100644
+++ b/include/linux/iommu.h
@@ -269,6 +269,8 @@ struct iommu_ops {
* struct iommu_domain_ops - domain specific operations
* @attach_dev: attach an iommu domain to a device
* @detach_dev: detach an iommu domain from a device
+ * @attach_dev_pasid: attach an iommu domain to a pasid of device
+ * @detach_dev_pasid: detach an iommu domain from a pasid of device
* @map: map a physically contiguous memory region to an iommu domain
* @map_pages: map a physically contiguous set of pages of the same size to
* an iommu domain.
@@ -286,6 +288,10 @@ struct iommu_ops {
struct iommu_domain_ops {
int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
+ int (*attach_dev_pasid)(struct iommu_domain *domain,
+ struct device *dev, ioasid_t id);
+ void (*detach_dev_pasid)(struct iommu_domain *domain,
+ struct device *dev, ioasid_t id);

ID should be pasid for consistency

Sure.


+int iommu_attach_device_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid)
+{
+ struct iommu_group *group;
+ int ret = -EINVAL;
+ void *curr;
+
+ if (!domain->ops->attach_dev_pasid)
+ return -EINVAL;
+
+ group = iommu_group_get(dev);
+ if (!group)
+ return -ENODEV;
+
+ mutex_lock(&group->mutex);
+ /*
+ * To keep things simple, we currently don't support IOMMU groups
+ * with more than one device. Existing SVA-capable systems are not
+ * affected by the problems that required IOMMU groups (lack of ACS
+ * isolation, device ID aliasing and other hardware issues).
+ */
+ if (!iommu_group_singleton_lockdown(group))
+ goto out_unlock;
+
+ xa_lock(&group->pasid_array);
+ curr = __xa_cmpxchg(&group->pasid_array, pasid, NULL,
+ domain, GFP_KERNEL);
+ xa_unlock(&group->pasid_array);

Why the xa_lock/unlock? Just call the normal xa_cmpxchg?

I should use xa_cmpxchg() instead.
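
For reference, here is a minimal userspace sketch of the "claim the slot only if it is empty" semantics that the plain xa_cmpxchg() call provides. In the kernel, xa_cmpxchg() takes the xa_lock internally and returns the previous entry, so the explicit xa_lock()/xa_unlock() pair is not needed. Everything named model_* below is hypothetical and only stands in for the xarray; the -EBUSY and -EINVAL return values are assumptions, not what the patch returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for the per-group PASID table; NOT the kernel xarray. */
#define MODEL_MAX_PASID 16

struct model_pasid_table {
	void *slot[MODEL_MAX_PASID];
};

/*
 * Models the return convention of xa_cmpxchg(): store @entry only if the
 * slot currently holds @old, and return whatever was there before.
 * The real helper also takes the xa_lock internally; locking is omitted here.
 */
static void *model_cmpxchg(struct model_pasid_table *t, unsigned int pasid,
			   void *old, void *entry)
{
	void *curr;

	assert(pasid < MODEL_MAX_PASID);
	curr = t->slot[pasid];
	if (curr == old)
		t->slot[pasid] = entry;
	return curr;
}

/* Attach succeeds only when the PASID slot was empty (assumed -EBUSY otherwise). */
static int model_attach_pasid(struct model_pasid_table *t, unsigned int pasid,
			      void *domain)
{
	void *curr = model_cmpxchg(t, pasid, NULL, domain);

	return curr ? -EBUSY : 0;
}

/* Detach clears the slot only if @domain is the one currently attached. */
static int model_detach_pasid(struct model_pasid_table *t, unsigned int pasid,
			      void *domain)
{
	void *curr = model_cmpxchg(t, pasid, domain, NULL);

	return curr == domain ? 0 : -EINVAL;
}
```

A second attach to the same PASID fails without disturbing the first one, which is the property the open-coded __xa_cmpxchg() sequence was after.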



+void iommu_detach_device_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid)
+{
+ struct iommu_group *group;
+
+ group = iommu_group_get(dev);
+ if (WARN_ON(!group))
+ return;

This group_get stuff really needs some cleaning, this makes no sense
at all.

If the kref to group can go to zero within this function then the
initial access of the kref is already buggy:

if (group)
kobject_get(group->devices_kobj);

Because it will crash or WARN_ON.

We don't hit this because it is required that a group cannot be
destroyed while a struct device has a driver bound, and all these
paths are driver bound paths.

So none of these group_get/put patterns have any purpose at all, and
now we are adding impossible WARN_ONs too..

The original intention of this check was to ensure that the helper is
called on the correct device. I agree that the WARN_ON() is unnecessary
because a NULL pointer dereference will be caught automatically.


+struct iommu_domain *
+iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+ struct iommu_domain *domain;
+ struct iommu_group *group;
+
+ group = iommu_group_get(dev);
+ if (!group)
+ return NULL;

And now we are doing useless things on a performance path!

Agreed.


+ mutex_lock(&group->mutex);
+ domain = xa_load(&group->pasid_array, pasid);
+ if (domain && domain->type == IOMMU_DOMAIN_SVA)
+ iommu_sva_domain_get_user(domain);
+ mutex_unlock(&group->mutex);
+ iommu_group_put(group);

Why do we need so much locking on a performance path? RCU out of the
xarray..

Not sure I see how this get_user refcounting can work ?

I should move the refcounting into iommu_domain to make the change
easier to review. Will improve this in the new version.
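
To make the discussion concrete, below is a rough userspace sketch of one common pattern for a count that lives in the object itself: a lookup may take a reference only while the count is still non-zero. This is only an illustration using C11 atomics, not the code that will be in the next version; struct model_domain and the model_domain_* helpers are hypothetical names, and the kernel would use its own refcount helpers instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a domain that carries its own user count. */
struct model_domain {
	atomic_uint users;
};

/* A freshly created domain starts with one reference held by its creator. */
static void model_domain_init(struct model_domain *d)
{
	atomic_init(&d->users, 1);
}

/* Take a reference only if the object is still live (count != 0). */
static bool model_domain_get(struct model_domain *d)
{
	unsigned int old = atomic_load(&d->users);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&d->users, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool model_domain_put(struct model_domain *d)
{
	unsigned int old = atomic_fetch_sub(&d->users, 1);

	assert(old != 0);	/* catch underflow in this model */
	return old == 1;
}
```

Once the last reference is dropped, a later model_domain_get() fails instead of reviving the object. The sketch does not cover how the lookup keeps the memory itself valid while it checks the count, which is the part the review comment is asking about.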


Jason

Best regards,
baolu