Re: [RFC] /dev/ioasid uAPI proposal
From: Jason Gunthorpe
Date: Wed Jun 02 2021 - 12:38:05 EST
On Wed, Jun 02, 2021 at 04:57:52PM +1000, David Gibson wrote:
> I don't think presence or absence of a group fd makes a lot of
> difference to this design. Having a group fd just means we attach
> groups to the ioasid instead of individual devices, and we no longer
> need the bookkeeping of "partial" devices.
Oh, I think we really don't want to attach the group to an ioasid, or
at least not as a first-class idea.
The fundamental problem that got us here is we now live in a world
where there are many ways to attach a device to an IOASID:
- A RID binding
- A RID,PASID binding
- A RID,PASID binding for ENQCMD
- A SW TABLE binding
- etc
The selection of which mode to use is based on the specific
driver/device operation. I.e. the thing that implements the 'struct
vfio_device' is the thing that has to select the binding mode.
Group attachment was fine when there was only one mode. As you say, it
is fine to just attach every group member with RID binding if RID
binding is the only option.
When SW TABLE binding was added the group code was hacked up - now the
group logic is choosing between RID/SW TABLE in a very hacky and mdev
specific way, and this is just a mess.
The flow must carry the IOASID from the /dev/iommu to the vfio_device
driver and the vfio_device implementation must choose which binding
mode and parameters it wants based on driver and HW configuration.
E.g. if two PCI devices are in a group then it is perfectly fine that
one device uses RID binding and the other device uses RID,PASID
binding.
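To make that concrete, here is a minimal userspace sketch of the flow, not
kernel code. Every name in it (the sketch_* types, the two attach callbacks,
the placeholder PASID value) is made up for illustration and is not part of
any proposed uAPI. The only point is that the IOASID is handed to the device
driver, and the driver records whichever binding mode it wants, regardless of
the group the device sits in.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical and only illustrate the flow. */
enum sketch_bind_mode {
	SKETCH_BIND_NONE,
	SKETCH_BIND_RID,
	SKETCH_BIND_RID_PASID,
	SKETCH_BIND_SW_TABLE,
};

struct sketch_device;

struct sketch_device_ops {
	/* The driver receives the IOASID and picks mode and parameters. */
	int (*attach_ioasid)(struct sketch_device *dev, uint32_t ioasid);
};

struct sketch_device {
	const char *name;
	int group_id;                 /* IOMMU group; not consulted when binding */
	const struct sketch_device_ops *ops;
	enum sketch_bind_mode mode;   /* recorded by the driver on attach */
	uint32_t ioasid;
	uint32_t pasid;               /* only meaningful for RID,PASID binding */
};

/* A driver that only ever wants plain RID binding. */
static int plain_pci_attach(struct sketch_device *dev, uint32_t ioasid)
{
	dev->mode = SKETCH_BIND_RID;
	dev->ioasid = ioasid;
	return 0;
}

/* A driver that wants RID,PASID binding for the same IOASID. */
static int pasid_pci_attach(struct sketch_device *dev, uint32_t ioasid)
{
	dev->mode = SKETCH_BIND_RID_PASID;
	dev->ioasid = ioasid;
	dev->pasid = 42; /* placeholder; a real driver would allocate one */
	return 0;
}

static const struct sketch_device_ops plain_pci_ops = {
	.attach_ioasid = plain_pci_attach,
};

static const struct sketch_device_ops pasid_pci_ops = {
	.attach_ioasid = pasid_pci_attach,
};

/* Core side: carry the IOASID to the driver, nothing more. */
static int sketch_attach(struct sketch_device *dev, uint32_t ioasid)
{
	assert(dev && dev->ops && dev->ops->attach_ioasid);
	return dev->ops->attach_ioasid(dev, ioasid);
}
```

Two devices with the same group_id can be passed to sketch_attach() with the
same IOASID and end up with different modes, which is the case described
above.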
The only place I see for a "group bind" in the uAPI is some compat
layer for the vfio container, and the implementation would be quite
different: we'd have to call each vfio_device driver in the group and
execute the IOASID attach IOCTL.
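Roughly, such a compat "group bind" could look like the sketch below. This is
a userspace model under assumptions, not a proposed implementation: the
sketch_* names, the flat device array, and the attach callback standing in for
the per-device IOASID attach IOCTL are all hypothetical. Error unwinding is
deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; they only model the compat idea described above. */
struct sketch_device {
	int group_id;
	uint32_t ioasid;   /* 0 means "not attached" in this sketch */
	/* Stand-in for the per-device IOASID attach ioctl handler. */
	int (*attach_ioasid)(struct sketch_device *dev, uint32_t ioasid);
};

static int sketch_default_attach(struct sketch_device *dev, uint32_t ioasid)
{
	dev->ioasid = ioasid;
	return 0;
}

/*
 * Compat "group bind": there is no group-level attach. Walk the devices
 * and run the per-device attach for every member of the group.
 * Returns the number of devices attached, or the first negative error.
 */
static int sketch_compat_group_attach(struct sketch_device *devs, size_t n,
				      int group_id, uint32_t ioasid)
{
	int attached = 0;

	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int ret;

		if (devs[i].group_id != group_id)
			continue;

		ret = devs[i].attach_ioasid(&devs[i], ioasid);
		if (ret < 0)
			return ret; /* no unwind of earlier attaches here */
		attached++;
	}
	return attached;
}
```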
> > I would say no on the container. /dev/ioasid == the container, having
> > two competing objects at once in a single process is just a mess.
>
> Right. I'd assume that for compatibility, creating a container would
> create a single IOASID under the hood with a compatibility layer
> translating the container operations to ioasid operations.
It is a nice dream for sure.
/dev/vfio could be a special case of /dev/ioasid just with a different
uapi and ending up with only one IOASID. They could be interchangeable
from then on, which would simplify the internals of VFIO if it
consistently dealt with these new ioasid objects everywhere. But last I
looked it was complicated enough to best be done later on.
Jason