Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs

From: David Gibson
Date: Mon Jun 07 2021 - 22:25:58 EST


On Tue, Jun 01, 2021 at 09:57:12AM -0300, Jason Gunthorpe wrote:
> On Tue, Jun 01, 2021 at 02:03:33PM +1000, David Gibson wrote:
> > On Thu, May 27, 2021 at 03:48:47PM -0300, Jason Gunthorpe wrote:
> > > On Thu, May 27, 2021 at 02:58:30PM +1000, David Gibson wrote:
> > > > On Tue, May 25, 2021 at 04:52:57PM -0300, Jason Gunthorpe wrote:
> > > > > On Wed, May 26, 2021 at 12:56:30AM +0530, Kirti Wankhede wrote:
> > > > >
> > > > > > 2. iommu backed mdev devices for SRIOV where mdev device is created per
> > > > > > VF (mdev device == VF device) then that mdev device has same iommu
> > > > > > protection scope as VF associated to it.
> > > > >
> > > > > This doesn't require, and certainly shouldn't create, a fake group.
> > > >
> > > > It's only fake if you start with a narrow view of what a group is.
> > >
> > > A group is connected to drivers/iommu. A group object without *any*
> > > relation to drivers/iommu is just a complete fiction, IMHO.
> >
> > That might be where we differ. As I've said, my group I'm primarily
> > meaning the fundamental hardware unit of isolation. *Usually* that's
> > determined by the capabilities of an IOMMU, but in some cases it might
> > not be. In either case, the boundaries still matter.
>
> As in my other email we absolutely need a group concept, it is just a
> question of how the user API is designed around it.
>
> > > The group mdev implicitly creates is just a fake proxy that comes
> > > along with mdev API. It doesn't do anything and it doesn't mean
> > > anything.
> >
> > But.. the case of multiple mdevs managed by a single PCI device with
> > an internal IOMMU also exists, and then the mdev groups are *not*
> > proxies but true groups independent of the parent device. Which means
> > that the group structure of mdevs can vary, which is an argument *for*
> > keeping it, not against.
>
> If VFIO becomes more "vfio_device" centric then the vfio_device itself
> has some properties. One of those can be "is it inside a drivers/iommu
> group, or not?".
>
> If the vfio_device is not using a drivers/iommu IOMMU interface then
> it can just have no group at all - no reason to lie. This would mean
> that the device has perfect isolation.

When you say "not using a drivers/iommu IOMMU interface" do you
basically mean the device doesn't do DMA? I can see some benefit to
that, but some drawbacks too. The *main* form of isolation (or lack
thereof) that groups is about the IOMMU, but groups can also represent
other forms of isolation failure: e.g. a multifunction device, where
function 0 has some debug registers which affect other functions.
That's relevant whether or not any of those functions use DMA.

Now, we could represent those different sorts of isolation separately,
but at the time our thinking was that we should group together devices
that can't be safely isolated for *any* reason, since the practical
upshot is the same: you can't safely split those devices between
different owners.

> What I don't like is forcing certain things depending on how the
> vfio_device was created - for instance forcing a IOMMU group as part
> and forcing an ugly "SW IOMMU" mode in the container only as part of
> mdev_device.

I don't really see how this is depending on how the device is created.
The current VFIO model is that every device always belongs to a group
- but that group might be a singleton. That seems less complicated to
me that some devices do and some don't have a group.

> These should all be properties of the vfio_device itself.
>
> Again this is all about the group fd - and how to fit in with the
> /dev/ioasid proposal from Kevin:
>
> https://lore.kernel.org/kvm/MWHPR11MB1886422D4839B372C6AB245F8C239@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> Focusing on vfio_device and skipping the group fd smooths out some
> rough edges.
>
> Code wise we are not quite there, but I have mapped out eliminating
> the group from the vfio_device centric API and a few other places it
> has crept in.
>
> The group can exist in the background to enforce security without
> being a cornerstone of the API design.
>
> Jason
>

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature