Re: [RFC 13/20] iommu: Extend iommu_at[de]tach_device() for multiple devices group

From: Jason Gunthorpe
Date: Mon Oct 25 2021 - 08:14:18 EST


On Mon, Oct 25, 2021 at 04:14:56PM +1100, David Gibson wrote:
> On Mon, Oct 18, 2021 at 01:32:38PM -0300, Jason Gunthorpe wrote:
> > On Mon, Oct 18, 2021 at 02:57:12PM +1100, David Gibson wrote:
> >
> > > The first user might read this. Subsequent users are likely to just
> > > copy paste examples from earlier things without fully understanding
> > > them. In general documenting restrictions somewhere is never as
> > > effective as making those restrictions part of the interface signature
> > > itself.
> >
> > I'd think this argument would hold more water if you could point to
> > someplace in existing userspace that cares about the VFIO grouping.
>
> My whole point here is that the proposed semantics mean that we have
> weird side effects even if the app doesn't think it cares about
> groups.
>
> e.g. App's input is a bunch of PCI addresses for NICs. It attaches
> each one to a separate IOAS and bridges packets between them all. As
> far as the app is concerned, it doesn't care about groups, as you say.
>
> Except that it breaks if any two of the devices are in the same group.
> Worse, it has a completely horrible failure mode: no syscall returns

Huh? If an app requests an IOAS attach that is not possible, then the
attach ioctl will fail.

The kernel must track groups. Once group A is on IOAS A, any further
attach of a group A device must specify IOAS A or it will receive a
failure.

The kernel should never blindly acknowledge a failed attachment.
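
To make the rule concrete, here is a minimal sketch of the bookkeeping
described above. All of the names (struct ioas, struct group,
sketch_attach() and so on) are hypothetical and exist only for
illustration; this is not the actual kernel interface, and the choice
of -EBUSY as the error code is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical types for illustration; not real kernel structures. */
struct ioas {
	int id;
};

struct group {
	struct ioas *ioas;	/* IOAS the group is on, NULL if none */
	unsigned int attached;	/* attached devices in this group */
};

struct device {
	struct group *group;
	struct ioas *ioas;	/* NULL while the device is detached */
};

/*
 * Attach dev to ioas. The first attach in a group pins the group to
 * that IOAS; later attaches in the same group must name the same IOAS
 * or get an error (-EBUSY is an assumed choice of error code).
 */
int sketch_attach(struct device *dev, struct ioas *ioas)
{
	struct group *grp = dev->group;

	if (dev->ioas)
		return -EBUSY;	/* device is already attached */
	if (grp->ioas && grp->ioas != ioas)
		return -EBUSY;	/* group is on a different IOAS */

	grp->ioas = ioas;
	grp->attached++;
	dev->ioas = ioas;
	return 0;
}

/* Detach dev; the group is released once its last device is gone. */
void sketch_detach(struct device *dev)
{
	struct group *grp = dev->group;

	if (!dev->ioas)
		return;		/* nothing to do */
	dev->ioas = NULL;
	if (--grp->attached == 0)
		grp->ioas = NULL;
}
```

With two devices in one group, attaching the first to IOAS A and then
trying to attach the second to IOAS B returns the error to the caller
right away, so the conflict is visible at the syscall rather than
showing up later as broken DMA.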

Jason