Re: [PATCH v3 2/4] iommu: Add iommu_aux_at(de)tach_group()

From: Alex Williamson
Date: Thu Jul 30 2020 - 15:47:07 EST


On Wed, 29 Jul 2020 23:34:40 +0000
"Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:

> > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > Sent: Thursday, July 30, 2020 4:04 AM
> >
> > On Thu, 16 Jul 2020 09:07:46 +0800
> > Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
> >
> > > Hi Jacob,
> > >
> > > On 7/16/20 12:01 AM, Jacob Pan wrote:
> > > > On Wed, 15 Jul 2020 08:47:36 +0800
> > > > Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > >> Hi Jacob,
> > > >>
> > > >> On 7/15/20 12:39 AM, Jacob Pan wrote:
> > > >>> On Tue, 14 Jul 2020 13:57:01 +0800
> > > >>> Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx> wrote:
> > > >>>
> > > >>>> This adds two new aux-domain APIs for a use case like vfio/mdev
> > > >>>> where sub-devices derived from an aux-domain capable device are
> > > >>>> created and put in an iommu_group.
> > > >>>>
> > > >>>> /**
> > > >>>> * iommu_aux_attach_group - attach an aux-domain to an
> > iommu_group
> > > >>>> which
> > > >>>> * contains sub-devices (for example
> > > >>>> mdevs) derived
> > > >>>> * from @dev.
> > > >>>> * @domain: an aux-domain;
> > > >>>> * @group: an iommu_group which contains sub-devices derived
> > from
> > > >>>> @dev;
> > > >>>> * @dev: the physical device which supports
> > IOMMU_DEV_FEAT_AUX.
> > > >>>> *
> > > >>>> * Returns 0 on success, or an error value.
> > > >>>> */
> > > >>>> int iommu_aux_attach_group(struct iommu_domain *domain,
> > > >>>> struct iommu_group *group,
> > > >>>> struct device *dev)
> > > >>>>
> > > >>>> /**
> > > >>>> * iommu_aux_detach_group - detach an aux-domain from an
> > > >>>> iommu_group *
> > > >>>> * @domain: an aux-domain;
> > > >>>> * @group: an iommu_group which contains sub-devices derived
> > from
> > > >>>> @dev;
> > > >>>> * @dev: the physical device which supports
> > IOMMU_DEV_FEAT_AUX.
> > > >>>> *
> > > >>>> * @domain must have been attached to @group via
> > > >>>> iommu_aux_attach_group(). */
> > > >>>> void iommu_aux_detach_group(struct iommu_domain *domain,
> > > >>>> struct iommu_group *group,
> > > >>>> struct device *dev)
> > > >>>>
> > > >>>> It also adds a flag in the iommu_group data structure to identify
> > > >>>> an iommu_group with aux-domain attached from those normal ones.
> > > >>>>
> > > >>>> Signed-off-by: Lu Baolu<baolu.lu@xxxxxxxxxxxxxxx>
> > > >>>> ---
> > > >>>> drivers/iommu/iommu.c | 58
> > > >>>> +++++++++++++++++++++++++++++++++++++++++++
> > include/linux/iommu.h |
> > > >>>> 17 +++++++++++++ 2 files changed, 75 insertions(+)
> > > >>>>
> > > >>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > > >>>> index e1fdd3531d65..cad5a19ebf22 100644
> > > >>>> --- a/drivers/iommu/iommu.c
> > > >>>> +++ b/drivers/iommu/iommu.c
> > > >>>> @@ -45,6 +45,7 @@ struct iommu_group {
> > > >>>> struct iommu_domain *default_domain;
> > > >>>> struct iommu_domain *domain;
> > > >>>> struct list_head entry;
> > > >>>> + unsigned int aux_domain_attached:1;
> > > >>>> };
> > > >>>>
> > > >>>> struct group_device {
> > > >>>> @@ -2759,6 +2760,63 @@ int iommu_aux_get_pasid(struct
> > iommu_domain
> > > >>>> *domain, struct device *dev) }
> > > >>>> EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);
> > > >>>>
> > > >>>> +/**
> > > >>>> + * iommu_aux_attach_group - attach an aux-domain to an
> > iommu_group
> > > >>>> which
> > > >>>> + * contains sub-devices (for example
> > > >>>> mdevs) derived
> > > >>>> + * from @dev.
> > > >>>> + * @domain: an aux-domain;
> > > >>>> + * @group: an iommu_group which contains sub-devices derived
> > from
> > > >>>> @dev;
> > > >>>> + * @dev: the physical device which supports
> > IOMMU_DEV_FEAT_AUX.
> > > >>>> + *
> > > >>>> + * Returns 0 on success, or an error value.
> > > >>>> + */
> > > >>>> +int iommu_aux_attach_group(struct iommu_domain *domain,
> > > >>>> + struct iommu_group *group, struct
> > > >>>> device *dev) +{
> > > >>>> + int ret = -EBUSY;
> > > >>>> +
> > > >>>> + mutex_lock(&group->mutex);
> > > >>>> + if (group->domain)
> > > >>>> + goto out_unlock;
> > > >>>> +
> > > >>> Perhaps I missed something but are we assuming only one mdev per
> > > >>> mdev group? That seems to change the logic where vfio does:
> > > >>> iommu_group_for_each_dev()
> > > >>> iommu_aux_attach_device()
> > > >>>
> > > >>
> > > >> It has been changed in PATCH 4/4:
> > > >>
> > > >> static int vfio_iommu_attach_group(struct vfio_domain *domain,
> > > >> struct vfio_group *group)
> > > >> {
> > > >> if (group->mdev_group)
> > > >> return iommu_aux_attach_group(domain->domain,
> > > >> group->iommu_group,
> > > >> group->iommu_device);
> > > >> else
> > > >> return iommu_attach_group(domain->domain,
> > > >> group->iommu_group);
> > > >> }
> > > >>
> > > >> So, for both normal domain and aux-domain, we use the same concept:
> > > >> attach a domain to a group.
> > > >>
> > > > I get that, but don't you have to attach all the devices within the
> > >
> > > This iommu_group includes only mediated devices derived from an
> > > IOMMU_DEV_FEAT_AUX-capable device. Different from
> > iommu_attach_group(),
> > > iommu_aux_attach_group() doesn't need to attach the domain to each
> > > device in group, instead it only needs to attach the domain to the
> > > physical device where the mdev's were created from.
> > >
> > > > group? Here you see the group already has a domain and exit.
> > >
> > > If the (group->domain) has been set, that means a domain has already
> > > attached to the group, so it returns -EBUSY.
> >
> > I agree with Jacob, singleton groups should not be built into the IOMMU
> > API, we're not building an interface just for mdevs or current
> > limitations of mdevs. This also means that setting a flag on the group
> > and passing a device that's assumed to be common for all devices within
> > the group, don't really make sense here. Thanks,
> >
> > Alex
>
> Baolu and I discussed about this assumption before. The assumption is
> not based on singleton groups. We do consider multiple mdevs in one
> group. But our feeling at the moment is that all mdevs (or other AUX
> derivatives) in the same group should come from the same parent
> device, thus comes with above design. Does it sound a reasonable
> assumption to you?

No, the approach in this series doesn't really make sense to me. We
currently have the following workflow as Baolu notes in the cover
letter:

domain = iommu_domain_alloc(bus);

iommu_group_for_each_dev(group...

iommu_device = mdev-magic()

if (iommu_dev_feature_enabled(iommu_device,
IOMMU_DEV_FEAT_AUX))
iommu_aux_attach_device(domain, iommu_device);

And we want to convert this to a group function, like we have for
non-aux domains:

domain = iommu_domain_alloc(bus);

iommu_device = mdev-magic()

iommu_aux_attach_group(domain, group, iommu_device);

And I think we want to do that largely because iommu_group.domain is
private to iommu.c (therefore vfio code cannot set it), but we need it
set in order for iommu_get_domain_for_dev() to work with a group
attached to an aux domain. Passing an iommu_device avoids the problem
that IOMMU API code doesn't know how to derive an iommu_device for each
device in the group, but while doing so it ignores the fundamental
nature of a group as being a set of one or more devices. Even if we
can make the leap that all devices within the group would use the same
iommu_device, an API that sets and aux domain for a group while
entirely ignoring the devices within the group seems very broken.

So, barring adding an abstraction at struct device where an IOMMU API
could retrieve the iommu_device backing anther device (which seems a
very abstract concept for the base class), why not have the caller
provide a lookup function? Ex:

int iommu_aux_attach_group(struct iommu_domain *domain,
struct iommu_group *group,
struct device *(*iommu_device_lookup)(
struct device *dev));

Thus vfio could could simply provide &vfio_mdev_get_iommu_device and
we'd have equivalent functionality to what we have currently, but with
the domain pointer set in the iommu_group.

This also however highlights that our VF backed mdevs will have the
same issue, so maybe this new IOMMU API interface should mimic
vfio_mdev_attach_domain() more directly, testing whether the resulting
device supports IOMMU_DEV_FEAT_AUX and using an aux vs non-aux attach.
I'm not sure what the name of this combined function should be,
iommu_attach_group_with_lookup()? This could be the core
implementation of iommu_attach_group() where the existing function
simply wraps the call with a NULL function pointer.

Anyway, I think there are ways to implement this that are more in line
with the spirit of groups. Thanks,

Alex