Re: [PATCH 1/8] iommu: Introduce a replace API for device pasid
From: Jason Gunthorpe
Date: Wed Mar 20 2024 - 08:38:22 EST
On Tue, Mar 19, 2024 at 03:29:39PM +0800, Yi Liu wrote:
> On 2024/3/19 00:52, Jason Gunthorpe wrote:
> > On Wed, Mar 13, 2024 at 04:11:41PM +0800, Yi Liu wrote:
> >
> > > yes. how about your opinion? @Jason. I noticed the set_dev_pasid callback
> > > and the pasid_array update are both under the group->lock, so it should be
> > > fine to adjust the order and update pasid_array after set_dev_pasid returns.
> >
> > Yes, it makes some sense
> >
> > But, also I would like it very much if we just have the core pass in
> > the actual old domain as an additional function argument.
>
> ok, this works too. For normal attach, just pass in a NULL old domain.
>
> > I think we have some small mistakes in multi-device group error
> > unwinding for remove because the global xarray isn't actually
> > going to be correct in all scenarios.
>
> do you mean the __iommu_remove_group_pasid() call in the below function?
> Currently, it is called when __iommu_set_group_pasid() fails. However,
> __iommu_set_group_pasid() may need to do the removal itself when an error happens,
> so the helper can be more self-contained. Or you mean something else?
Yes..
> int iommu_attach_device_pasid(struct iommu_domain *domain,
> struct device *dev, ioasid_t pasid)
> {
> /* Caller must be a probed driver on dev */
> struct iommu_group *group = dev->iommu_group;
> void *curr;
> int ret;
>
> if (!domain->ops->set_dev_pasid)
> return -EOPNOTSUPP;
>
> if (!group)
> return -ENODEV;
>
> if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner)
> return -EINVAL;
>
> mutex_lock(&group->mutex);
> curr = xa_cmpxchg(&group->pasid_array, pasid, NULL, domain, GFP_KERNEL);
> if (curr) {
> ret = xa_err(curr) ? : -EBUSY;
> goto out_unlock;
> }
>
> ret = __iommu_set_group_pasid(domain, group, pasid);
So here we have the xa set to the new domain
> if (ret) {
> __iommu_remove_group_pasid(group, pasid);
And here we still have it set to the new domain even though some of
the devices within the group failed to attach. The logic needs to be
more like the main domain attach path, where we iterate and then undo
only what we did.
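
Roughly what I mean by "iterate and then undo only what we did", as a
userspace sketch rather than kernel code. The struct layouts, the
fail_attach knob, the plain pointer standing in for the xarray slot, and
attach_pasid() are all made up for illustration; only the shape of the
unwind matters. It also records the new domain only after every device
has attached, as discussed earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, not the real definitions. */
struct domain { int id; };

struct device {
	const struct domain *pasid_domain; /* what this device has attached */
	int fail_attach;                   /* illustration knob: force a failure */
};

/* Stand-in for a driver's set_dev_pasid callback. */
static int set_dev_pasid(struct device *dev, const struct domain *domain)
{
	if (dev->fail_attach)
		return -1;
	dev->pasid_domain = domain;
	return 0;
}

/* Stand-in for a driver's remove callback. */
static void remove_dev_pasid(struct device *dev)
{
	dev->pasid_domain = NULL;
}

/*
 * Attach the domain to every device in the group. On failure, detach only
 * the devices this call attached, so the helper is self-contained.
 */
static int set_group_pasid(struct device *devs, size_t n,
			   const struct domain *domain)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = set_dev_pasid(&devs[i], domain);
		if (ret)
			goto err_unwind;
	}
	return 0;

err_unwind:
	/* devs[i] failed and was never attached; undo devs[0..i-1] only. */
	while (i-- > 0)
		remove_dev_pasid(&devs[i]);
	return ret;
}

/*
 * "slot" plays the role of the per-PASID table entry. It is written only
 * after the whole group attached, so it never names a half-attached domain.
 */
static int attach_pasid(const struct domain **slot, struct device *devs,
			size_t n, const struct domain *domain)
{
	int ret;

	assert(domain != NULL);
	if (*slot)
		return -2; /* busy: something is already attached */

	ret = set_group_pasid(devs, n, domain);
	if (ret)
		return ret;

	*slot = domain;
	return 0;
}
```

If the last device of a three-device group fails, the first two are
detached again and the slot stays empty, so a later retry starts from a
clean state.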
And the whole thing is easier to reason about if an input argument
specifies the current attached domain instead of having the driver
read it from the xarray.
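
As a sketch of the "old domain as an argument" idea, again in userspace
with made-up types: the callback signature, replace_group_pasid() and the
fail_attach knob are assumptions for illustration, not a proposed final
API. The caller says what is attached now (NULL for a normal attach), so
the driver never consults a shared table, and the unwind can put back
exactly what was there before.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, not the real definitions. */
struct domain { int id; };

struct device {
	const struct domain *attached; /* what this device has attached */
	int fail_attach;               /* illustration knob: force a failure */
};

/*
 * Hypothetical callback shape: "old" is what the core says is attached
 * right now, NULL for a normal (first) attach.
 */
static int set_dev_pasid(struct device *dev, const struct domain *new_domain,
			 const struct domain *old)
{
	if (dev->fail_attach)
		return -1;
	/* The caller's view and the device state must agree. */
	assert(dev->attached == old);
	dev->attached = new_domain;
	return 0;
}

/*
 * Move every device in the group from old to new_domain. On failure,
 * move the devices already switched back to old.
 */
static int replace_group_pasid(struct device *devs, size_t n,
			       const struct domain *new_domain,
			       const struct domain *old)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = set_dev_pasid(&devs[i], new_domain, old);
		if (ret)
			goto err_restore;
	}
	return 0;

err_restore:
	/*
	 * devs[0..i-1] now hold new_domain. This sketch assumes that going
	 * back to a previously working attachment cannot fail.
	 */
	while (i-- > 0)
		(void)set_dev_pasid(&devs[i], old, new_domain);
	return ret;
}
```

With old == NULL this degenerates to the normal attach case, and the
restore loop simply leaves the devices with nothing attached.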
Jason