Re: [PATCH v3 1/2] vfio/type1: Simplify bus_type determination

From: Alex Williamson
Date: Mon Jun 27 2022 - 15:21:49 EST


On Fri, 24 Jun 2022 18:51:44 +0100
Robin Murphy <robin.murphy@xxxxxxx> wrote:

> Since IOMMU groups are mandatory for drivers to support, it stands to
> reason that any device which has been successfully added to a group
> must be on a bus supported by that IOMMU driver, and therefore a domain
> viable for any device in the group must be viable for all devices in
> the group. This already has to be the case for the IOMMU API's internal
> default domain, for instance. Thus even if the group contains devices on
> different buses, that can only mean that the IOMMU driver actually
> supports such an odd topology, and so without loss of generality we can
> expect the bus type of any device in a group to be suitable for IOMMU
> API calls.
>
> Furthermore, scrutiny reveals a lack of protection for the bus being
> removed while vfio_iommu_type1_attach_group() is using it; the reference
> that VFIO holds on the iommu_group ensures that data remains valid, but
> does not prevent the group's membership changing underfoot.
>
> We can address both concerns by recycling vfio_bus_type() into some
> superficially similar logic to indirect the IOMMU API calls themselves.
> Each call is thus protected from races by the IOMMU group's own locking,
> and we no longer need to hold group-derived pointers beyond that scope.
> It also gives us an easy path for the IOMMU API's migration of bus-based
> interfaces to device-based, of which we can already take the first step
> with device_iommu_capable(). As with domains, any capability must in
> practice be consistent for devices in a given group - and after all it's
> still the same capability which was expected to be consistent across an
> entire bus! - so there's no need for any complicated validation.
>
> Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
> ---
>
> v3: Complete rewrite yet again, and finally it doesn't feel like we're
> stretching any abstraction boundaries the wrong way, and the diffstat
> looks right too. I did think about embedding IOMMU_CAP_INTR_REMAP
> directly in the callback, but decided I like the consistency of minimal
> generic wrappers. And yes, if the capability isn't supported then it
> does end up getting tested for the whole group, but meh, it's harmless.
>
> drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++----------------
> 1 file changed, 22 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index c13b9290e357..a77ff00c677b 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1679,18 +1679,6 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
> return ret;
> }
>
> -static int vfio_bus_type(struct device *dev, void *data)
> -{
> - struct bus_type **bus = data;
> -
> - if (*bus && *bus != dev->bus)
> - return -EINVAL;
> -
> - *bus = dev->bus;
> -
> - return 0;
> -}
> -
> static int vfio_iommu_replay(struct vfio_iommu *iommu,
> struct vfio_domain *domain)
> {
> @@ -2153,13 +2141,25 @@ static void vfio_iommu_iova_insert_copy(struct vfio_iommu *iommu,
> list_splice_tail(iova_copy, iova);
> }
>

Any objection if I add the following comment:

/* Redundantly walks non-present capabilities to simplify caller */

Thanks,
Alex

> +static int vfio_iommu_device_capable(struct device *dev, void *data)
> +{
> + return device_iommu_capable(dev, (enum iommu_cap)data);
> +}
> +
> +static int vfio_iommu_domain_alloc(struct device *dev, void *data)
> +{
> + struct iommu_domain **domain = data;
> +
> + *domain = iommu_domain_alloc(dev->bus);
> + return 1; /* Don't iterate */
> +}
> +
> static int vfio_iommu_type1_attach_group(void *iommu_data,
> struct iommu_group *iommu_group, enum vfio_group_type type)
> {
> struct vfio_iommu *iommu = iommu_data;
> struct vfio_iommu_group *group;
> struct vfio_domain *domain, *d;
> - struct bus_type *bus = NULL;
> bool resv_msi, msi_remap;
> phys_addr_t resv_msi_base = 0;
> struct iommu_domain_geometry *geo;
> @@ -2192,18 +2192,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
> goto out_unlock;
> }
>
> - /* Determine bus_type in order to allocate a domain */
> - ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
> - if (ret)
> - goto out_free_group;
> -
> ret = -ENOMEM;
> domain = kzalloc(sizeof(*domain), GFP_KERNEL);
> if (!domain)
> goto out_free_group;
>
> + /*
> + * Going via the iommu_group iterator avoids races, and trivially gives
> + * us a representative device for the IOMMU API call. We don't actually
> + * want to iterate beyond the first device (if any).
> + */
> ret = -EIO;
> - domain->domain = iommu_domain_alloc(bus);
> + iommu_group_for_each_dev(iommu_group, &domain->domain,
> + vfio_iommu_domain_alloc);
> if (!domain->domain)
> goto out_free_domain;
>
> @@ -2258,7 +2259,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
> list_add(&group->next, &domain->group_list);
>
> msi_remap = irq_domain_check_msi_remap() ||
> - iommu_capable(bus, IOMMU_CAP_INTR_REMAP);
> + iommu_group_for_each_dev(iommu_group, (void *)IOMMU_CAP_INTR_REMAP,
> + vfio_iommu_device_capable);
>
> if (!allow_unsafe_interrupts && !msi_remap) {
> pr_warn("%s: No interrupt remapping support. Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",