Re: [PATCH v1 04/10] iommufd/viommu: Allow drivers to control vdev_id lifecycle

From: Nicolin Chen
Date: Tue Oct 08 2024 - 13:45:40 EST


Sorry for the late reply. Just sat down and started to look at
this series.

On Thu, Sep 05, 2024 at 03:01:19PM -0300, Jason Gunthorpe wrote:
> On Tue, Aug 27, 2024 at 10:02:06AM -0700, Nicolin Chen wrote:
> > The iommufd core provides a lookup helper for an IOMMU driver to find a
> > device pointer by device's per-viommu virtual ID. Yet a driver may need
> > an inverted lookup to find a device's per-viommu virtual ID by a device
> > pointer, e.g. when reporting virtual IRQs/events back to the user space.
> > In this case, it'd be unsafe for iommufd core to do an inverted lookup,
> > as the driver can't track the lifecycle of a viommu object or a vdev_id
> > object.
> >
> > Meanwhile, some HW can even support virtual device ID lookup by its HW-
> > accelerated virtualization feature. E.g. Tegra241 CMDQV HW supports to
> > execute vanilla guest-issued SMMU commands containing virtual Stream ID
> > but requires software to configure a link between virtual Stream ID and
> > physical Stream ID via HW registers. So not only the iommufd core needs
> > a vdev_id lookup table, drivers will want one too.
> >
> > Given the two justifications above, it's the best practice to provide a
> > a pair of set_vdev_id/unset_vdev_id ops in the viommu ops, so a driver
> > can implement them to control a vdev_id's lifecycle, and configure the
> > HW properly if required.
>
> I think the lifecycle rules should be much simpler.
>
> If a nested domain is attached to a STE/RID/device then the vIOMMU
> affiliated with that nested domain is pinned while the STE is in place
>
> So the driver only need to provide locking around attach changing the
> STE's vIOMMU vs async operations translating from a STE to a
> vIOMMU. This can be a simple driver lock of some kind, ie a rwlock
> across the STE table.

I was worried about the async between a vdev link (idev<=>vIOMMU)
and an regular attach link (idev<=>nested_domain).

Though practically the vdev link wouldn't break until the regular
attach link breaks first, it was still not safe from it happening.
So, having a master->lock to protect master->vdev could ensure.

Now, with the vdev being an object refcounting idev/vIOMMU objs,
I think we can simply pin the vdev in iommufd_hw_pagetable_attach
to ensure the vdev won't disappear. Then, a driver wouldn't need
to worry about that. [1]

Meanwhile, a nested_domain pins the vIOMMU object in the iommufd
level upon allocation. [2]

So, what's missing seems to be the attach between the master->dev
and the nested_domain itself [3]

idev <----------- vdev [1] ---------> vIOMMU
(dev) ---[3]--> nested_domain ----[2]-----^

A lock, protecting the attach(), which is in parallel with iommu
core's group->mutex could help I think?

> Generally that is how all the async events should work, go from the
> STE to the VIOMMU to a iommufd callback to the iommufd event
> queue. iommufd will translate the struct device from the driver to an
> idev_id (or maybe even a vdevid) the same basic way the PRI code works

My first attempt was putting the vdev into the attach_handle that
was created for PRI code. Yet later on I found the links were too
long and some of them weren't safe (perhaps with the new vdev and
those pins mentioned above, it worth another try). So letting the
driver hold the vdev itself became much simpler.

Thanks
Nicolin