RE: [RFC PATCH 03/10] iommu/vt-d: Allocate groups for mediated devices

From: Tian, Kevin
Date: Tue Jul 24 2018 - 22:17:10 EST

> From: Jean-Philippe Brucker
> Sent: Tuesday, July 24, 2018 7:31 PM
> Hi Baolu,
> On 24/07/18 03:22, Lu Baolu wrote:
> > Hi,
> >
> > On 07/23/2018 12:44 PM, Liu, Yi L wrote:
> >>> From: Lu Baolu []
> >>> Sent: Sunday, July 22, 2018 2:09 PM
> >>>
> >>> With the Intel IOMMU supporting PASID granularity isolation and
> protection, a
> >>> mediated device could be isolated and protected by an IOMMU unit.
> We need to
> >>> allocate a new group instead of a PCI group.
> >> Existing vfio mdev framework also allocates an iommu group for
> mediate device.
> >>
> >> mdev_probe()
> >> |_ mdev_attach_iommu()
> >> |_ iommu_group_alloc()
> >
> > When external components ask iommu to allocate a group for a device,
> > it will call pci_device_group in Intel IOMMU driver's @device_group
> > callback. In another word, current Intel IOMMU driver doesn't aware
> > the mediated device and treat all devices as PCI ones. This patch
> > extends the @device_group call back to make it aware of a mediated
> > device.
> I agree that allocating two groups for an mdev seems strange, and in my
> opinion we shouldn't export the notion of mdev to IOMMU drivers.
> Otherwise each driver will have to add its own "dev_is_mdev()" special
> cases, which will get messy in the long run. Besides, the macro is
> currently private, and to be exported it should be wrapped in
> symbol_get/put(mdev_bus_type).

There is only one group per mdev, but I agree that avoiding mdev
awareness in IOMMU driver is definitely good... We didn't find a
good way before, which is why current RFC was implemented that

> There is another way: as we're planning to add a generic pasid_alloc()
> function to the IOMMU API, the mdev module itself could allocate a
> default PASID for each mdev by calling pasid_alloc() on the mdev's
> parent, and then do map()/unmap() with that PASID. This way we don't
> have to add IOMMU ops to the mdev bus, everything can still be done
> using the ops of the parent. And IOMMU drivers "only" have to implement
> PASID ops, which will be reused by drivers other than mdev.

yes, PASID is more general abstraction than mdev. However there is
one problem. Please note with new scalable mode now PASID is not
used just for SVA. You can do 1st-level, 2nd-level and nested in PASID
granularity, meaning each PASID can be bound to an unique iommu
domain. However today IOMMU driver allows only one iommu_domain
per device. If we simply install allocated PASID to parent device, we
should also remove that iommu_domain limitation. Is it the way that
we want to pursue?

mdev avoids such problem as it introduces a separate device...

> The allocated PASID also needs to be installed into the parent device.
> If the mdev module knows the PASID, we can do that by adding
> set_pasid(mdev, pasid) and clear_pasid(mdev, pasid) operations to
> mdev_parent_ops.