Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination

From: Alex Williamson
Date: Fri Jun 24 2022 - 12:04:56 EST


On Fri, 24 Jun 2022 16:12:55 +0100
Robin Murphy <robin.murphy@xxxxxxx> wrote:

> On 2022-06-24 15:28, Alex Williamson wrote:
> > On Fri, 24 Jun 2022 11:18:36 -0300
> > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >
> >> On Fri, Jun 24, 2022 at 08:11:59AM -0600, Alex Williamson wrote:
> >>> On Thu, 23 Jun 2022 22:50:30 -0300
> >>> Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >>>
> >>>> On Thu, Jun 23, 2022 at 05:00:44PM -0600, Alex Williamson wrote:
> >>>>
> >>>>>>>> +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group)
> >>>>>>>> +{
> >>>>>>>> +	struct vfio_group *group = vfio_group_get_from_iommu(iommu_group);
> >>>>>>>> +	struct vfio_device *device;
> >>>>>>>
> >>>>>>> Check group for NULL.
> >>>>>>
> >>>>>> OK - FWIW in context this should only ever make sense to call with an
> >>>>>> iommu_group which has already been derived from a vfio_group, and I did
> >>>>>> initially consider a check with a WARN_ON(), but then decided that the
> >>>>>> unguarded dereference would be a sufficiently strong message. No problem
> >>>>>> with bringing that back to make it more defensive if that's what you prefer.
> >>>>>
> >>>>> A while down the road, that's a bit too much implicit knowledge of the
> >>>>> intent and single purpose of this function just to simply avoid a test.
> >>>>
> >>>> I think we should just pass the 'struct vfio_group *' into the
> >>>> attach_group op and have this API take that type in and forget the
> >>>> vfio_group_get_from_iommu().
> >>>
> >>> That's essentially what I'm suggesting, the vfio_group is passed as an
> >>> opaque pointer which type1 can use for a
> >>> vfio_group_for_each_vfio_device() type call. Thanks,
> >>
> >> I don't want to add a whole vfio_group_for_each_vfio_device()
> >> machinery that isn't actually needed by anything.. This is all
> >> internal, we don't need to design more than exactly what is needed.
> >>
> >> At this point if we change the signature of the attach then we may as
> >> well just pass in the representative vfio_device, that is probably
> >> less LOC overall.
> >
> > That means that vfio core still needs to pick an arbitrary
> > representative device, which I find in fundamental conflict to the
> > nature of groups. Type1 is the interface to the IOMMU API, if through
> > the IOMMU API we can make an assumption that all devices within the
> > group are equivalent for a given operation, that should be done in type1
> > code, not in vfio core. A for-each interface is commonplace and not
> > significantly more code or design than already proposed. Thanks,
>
> It also occurred to me this morning that there's another middle-ground
> option staring out from the call-wrapping notion I mentioned yesterday -
> while I'm not keen to provide it from the IOMMU API, there's absolutely
> no reason that VFIO couldn't just use the building blocks by itself, and
> in fact it works out almost absurdly simple:
>
> static bool vfio_device_capable(struct device *dev, void *data)
> {
> 	return device_iommu_capable(dev, (enum iommu_cap)data);
> }
>
> bool vfio_group_capable(struct iommu_group *group, enum iommu_cap cap)
> {
> 	return iommu_group_for_each_dev(group, (void *)cap, vfio_device_capable);
> }
>
> and much the same for iommu_domain_alloc() once I get that far. The
> locking concern neatly disappears because we're no longer holding any
> bus or device pointer that can go stale. How does that seem as a
> compromise for now, looking forward to Jason's longer-term view of
> rearranging the attach_group process such that a vfio_device falls
> naturally to hand?

Yup, that seems like another way to do it, a slight iteration on the
current bus_type flow, and also avoids any sort of arbitrary
representative device being passed around as an API.

For clarity of the principle that all devices within the group should
have the same capabilities, we could even further follow the existing
bus_type and do a sanity test here at the same time, or perhaps simply
stop after the first device to avoid the if-any-device-is-capable
semantics implied above. Thanks,

Alex
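
To illustrate the three semantics discussed above (any device capable,
stop after the first device, and a sanity test that all devices agree),
here is a rough userspace sketch. Every mock_* name, struct cap_check
and the group_capable_*() helpers are hypothetical stand-ins invented
for this illustration; none of it is kernel code. The only behaviour
borrowed from the kernel is that iommu_group_for_each_dev() stops at
the first non-zero callback return and passes that value back, which
mock_group_for_each_dev() imitates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's types. */
enum mock_cap { MOCK_CAP_CACHE_COHERENCY, MOCK_CAP_INTR_REMAP };

struct mock_device {
	unsigned int caps;	/* bitmask indexed by enum mock_cap */
};

struct mock_group {
	const struct mock_device *devs;
	size_t ndevs;
};

static bool mock_device_capable(const struct mock_device *dev, enum mock_cap cap)
{
	return (dev->caps & (1u << cap)) != 0;
}

/* Imitates iommu_group_for_each_dev(): stop at the first non-zero return. */
static int mock_group_for_each_dev(const struct mock_group *group, void *data,
				   int (*fn)(const struct mock_device *, void *))
{
	int ret = 0;

	assert(fn != NULL);
	for (size_t i = 0; i < group->ndevs; i++) {
		ret = fn(&group->devs[i], data);
		if (ret)
			break;
	}
	return ret;
}

/* State shared with the callbacks below. */
struct cap_check {
	enum mock_cap cap;
	bool seen;	/* has a first device been recorded? */
	bool capable;	/* capability of that first device */
};

/* Any-device semantics: a capable device returns 1 and ends the walk. */
static int any_cb(const struct mock_device *dev, void *data)
{
	const struct cap_check *c = data;

	return mock_device_capable(dev, c->cap) ? 1 : 0;
}

static bool group_capable_any(const struct mock_group *group, enum mock_cap cap)
{
	struct cap_check c = { .cap = cap };

	return mock_group_for_each_dev(group, &c, any_cb) > 0;
}

/* First-device semantics: always return non-zero so the walk stops at once. */
static int first_cb(const struct mock_device *dev, void *data)
{
	const struct cap_check *c = data;

	return mock_device_capable(dev, c->cap) ? 1 : -1;
}

static bool group_capable_first(const struct mock_group *group, enum mock_cap cap)
{
	struct cap_check c = { .cap = cap };

	return mock_group_for_each_dev(group, &c, first_cb) > 0;
}

/* Sanity-test semantics: record the first device, fail on any mismatch. */
static int consistent_cb(const struct mock_device *dev, void *data)
{
	struct cap_check *c = data;
	bool capable = mock_device_capable(dev, c->cap);

	if (!c->seen) {
		c->seen = true;
		c->capable = capable;
		return 0;
	}
	return capable != c->capable ? -1 : 0;
}

/* Returns 1 if all devices are capable, 0 if none are, -1 if they disagree. */
static int group_capable_consistent(const struct mock_group *group,
				    enum mock_cap cap)
{
	struct cap_check c = { .cap = cap };

	if (mock_group_for_each_dev(group, &c, consistent_cb))
		return -1;
	return c.seen && c.capable;
}
```

With a two-device group where only the second device has the
capability, group_capable_any() reports true, group_capable_first()
reports false, and group_capable_consistent() returns -1, which is the
difference in semantics I'm getting at.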