Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs

From: Jason Gunthorpe
Date: Tue Jun 01 2021 - 08:57:37 EST


On Tue, Jun 01, 2021 at 02:03:33PM +1000, David Gibson wrote:
> On Thu, May 27, 2021 at 03:48:47PM -0300, Jason Gunthorpe wrote:
> > On Thu, May 27, 2021 at 02:58:30PM +1000, David Gibson wrote:
> > > On Tue, May 25, 2021 at 04:52:57PM -0300, Jason Gunthorpe wrote:
> > > > On Wed, May 26, 2021 at 12:56:30AM +0530, Kirti Wankhede wrote:
> > > >
> > > > > 2. iommu backed mdev devices for SRIOV where mdev device is created per
> > > > > VF (mdev device == VF device) then that mdev device has same iommu
> > > > > protection scope as VF associated to it.
> > > >
> > > > This doesn't require, and certainly shouldn't create, a fake group.
> > >
> > > It's only fake if you start with a narrow view of what a group is.
> >
> > A group is connected to drivers/iommu. A group object without *any*
> > relation to drivers/iommu is just a complete fiction, IMHO.
>
> That might be where we differ. As I've said, my group I'm primarily
> meaning the fundamental hardware unit of isolation. *Usually* that's
> determined by the capabilities of an IOMMU, but in some cases it might
> not be. In either case, the boundaries still matter.

As in my other email we absolutely need a group concept, it is just a
question of how the user API is designed around it.

> > The group mdev implicitly creates is just a fake proxy that comes
> > along with mdev API. It doesn't do anything and it doesn't mean
> > anything.
>
> But.. the case of multiple mdevs managed by a single PCI device with
> an internal IOMMU also exists, and then the mdev groups are *not*
> proxies but true groups independent of the parent device. Which means
> that the group structure of mdevs can vary, which is an argument *for*
> keeping it, not against.

If VFIO becomes more "vfio_device" centric then the vfio_device itself
has some properties. One of those can be "is it inside a drivers/iommu
group, or not?".

If the vfio_device is not using a drivers/iommu IOMMU interface then
it can just have no group at all - no reason to lie. This would mean
that the device has perfect isolation.

What I don't like is forcing certain things depending on how the
vfio_device was created - for instance forcing a IOMMU group as part
and forcing an ugly "SW IOMMU" mode in the container only as part of
mdev_device.

These should all be properties of the vfio_device itself.

Again this is all about the group fd - and how to fit in with the
/dev/ioasid proposal from Kevin:

https://lore.kernel.org/kvm/MWHPR11MB1886422D4839B372C6AB245F8C239@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Focusing on vfio_device and skipping the group fd smooths out some
rough edges.

Code wise we are not quite there, but I have mapped out eliminating
the group from the vfio_device centric API and a few other places it
has crept in.

The group can exist in the background to enforce security without
being a cornerstone of the API design.

Jason