Re: [RFC 13/20] iommu: Extend iommu_at[de]tach_device() for multiple devices group

From: David Gibson
Date: Mon Oct 25 2021 - 19:30:37 EST


On Mon, Oct 25, 2021 at 09:14:10AM -0300, Jason Gunthorpe wrote:
> On Mon, Oct 25, 2021 at 04:14:56PM +1100, David Gibson wrote:
> > On Mon, Oct 18, 2021 at 01:32:38PM -0300, Jason Gunthorpe wrote:
> > > On Mon, Oct 18, 2021 at 02:57:12PM +1100, David Gibson wrote:
> > >
> > > > The first user might read this. Subsequent users are likely to just
> > > > copy paste examples from earlier things without fully understanding
> > > > them. In general documenting restrictions somewhere is never as
> > > > effective as making those restrictions part of the interface signature
> > > > itself.
> > >
> > > I'd think this argument would hold more water if you could point to
> > > someplace in existing userspace that cares about the VFIO grouping.
> >
> > My whole point here is that the proposed semantics mean that we have
> > weird side effects even if the app doesn't think it cares about
> > groups.
> >
> > e.g. App's input is a bunch of PCI addresses for NICs. It attaches
> > each one to a separate IOAS and bridges packets between them all. As
> > far as the app is concerned, it doesn't care about groups, as you say.
> >
> > Except that it breaks if any two of the devices are in the same group.
> > Worse, it has a completely horrible failure mode: no syscall returns
>
> Huh? If an app requests an IOAS attach that is not possible then the
> attachment IOCTL will fail.
>
> The kernel must track groups and know that group A is on IOAS A and
> any further attach of a group A device must specify IOAS A or receive
> a failure.

Ok, I misunderstood the semantics that were suggested.

So, IIUC, what you're suggesting is that if group X is attached to
IOAS 1, then attaching the group to IOAS 1 again should succeed (as a
no-op), but attaching it to any other IOAS should fail?

That's certainly an improvement, but there are still some questions.

If you attach devices A and B (both in group X) to IOAS 1, then detach
device A, what happens? Do you detach both devices? Or do you have a
counter so you have to detach as many times as you attached?
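
To make the question concrete, here is a rough sketch of the counted
variant. Everything in it is made up for illustration (struct
toy_group, toy_attach(), toy_detach(), and the choice of -EBUSY and
-EINVAL); none of it is taken from the series, and it ignores locking
and per-device state entirely.

```c
#include <errno.h>

/* Hypothetical toy model, not the kernel's actual data structures. */
struct toy_group {
	int ioas_id;		/* only meaningful while users > 0 */
	unsigned int users;	/* number of outstanding attaches */
};

/*
 * Attaching to the IOAS the group already uses succeeds and bumps the
 * counter; attaching to any other IOAS fails.
 */
static int toy_attach(struct toy_group *g, int ioas_id)
{
	if (g->users && g->ioas_id != ioas_id)
		return -EBUSY;	/* assumed error code */

	g->ioas_id = ioas_id;
	g->users++;
	return 0;
}

/*
 * Counted detach: the group only leaves its IOAS once every attach
 * has been balanced by a detach.
 */
static int toy_detach(struct toy_group *g)
{
	if (!g->users)
		return -EINVAL;	/* assumed error code */

	if (--g->users == 0)
		g->ioas_id = 0;
	return 0;
}
```

With this model, attaching A and B and then detaching A leaves the
group on IOAS 1 until B is detached as well. The alternative would be
for the first detach to drop the whole group, which would pull the
IOAS out from under B.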

> The kernel should never blindly acknowledge a failed attachment.
>
> Jason
>

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
