RE: [PATCH v2 08/10] iommufd/device: Use iommu_group_replace_domain()

From: Tian, Kevin
Date: Tue Feb 14 2023 - 21:16:12 EST


> From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Sent: Wednesday, February 15, 2023 9:59 AM
>
> On Wed, Feb 15, 2023 at 01:37:19AM +0000, Tian, Kevin wrote:
>
> > > From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > > Sent: Tuesday, February 14, 2023 6:54 PM
> > >
> > > On Mon, Feb 13, 2023 at 10:49:34AM -0400, Jason Gunthorpe wrote:
> > > > On Sun, Feb 12, 2023 at 11:48:30PM -0800, Nicolin Chen wrote:
> > > >
> > > > > What about point 1? If dev2 and dev3 are already replaced when
> > > > > doing iommu_group_replace_domain() on dev1, their idev objects
> > > > > still have the old hwpt/iopt until user space does another two
> > > > > IOCTLs on them, right?
> > > >
> > > > We have a complicated model for multi-device groups...
> > > >
> > > > The first device in the group to change domains must move all the
> > > > devices in the group
> > > >
> > > > But userspace is still expected to run through and change all the
> > > > other devices
> > > >
> > > > So replace should be a NOP if the group is already linked to the right
> > > > domain.
> > > >
> > > > This patch needs to make sure that incosistency in the view betwen the
> > > > iommu_group and the iommufd model doesn't cause a functional
> > > > problem.
> > >
> > > Yea, I was thinking that we'd need to block any access to the
> > > idev->hwpt of a pending device's, before the kernel finishes
> > > the "NOP" IOCTL from userspace, maybe with a helper:
> > > (iommu_get_domain_for_dev(idev->dev) != idev->hwpt->domain)
> > >
> >
> > I didn't see what would be broken w/o such blocking measure.
> >
> > Can you elaborate?
>
> iommu_group { idev1, idev2 }
>
> (1) Attach all devices to domain1 {
> group->domain = domain1;
> idev1->hwpt = hwpt1; // domain1
> idev2->hwpt = hwpt1; // domain1
> }
>
> (2) Attach (replace) idev1 only to domain2 {
> group->domain = domain2
> idev1->hwpt = hwpt2; // domain2
> idev2->hwpt == domain1 // != iommu_get_domain_for_dev()
> }
>
> Then if user space isn't aware of these and continues to do
> IOMMU_IOAS_MAP for idev2. Though IOVA mappings may be added
> onto the domain2 correctly, yet domain2's iopt itree won't
> reflect that, until idev2->hwpt is updated too, right?

IOMMU_IOAS_MAP is for ioas instead of for device.

even w/o any device attached we still allow ioas map.

also note in case of domain1 still actively attached to idev3 in
another group, banning IOAS_MAP due to idev2 in transition
is also incorrect for idev3.

>
> And the tricky thing is that, though we advocate a device-
> centric uAPI, we'd still have to ask user space to align the
> devices in the same iommu_group, which is also not present
> in QEMU's RFC v3 series.

IMHO this requirement has been stated clearly from the start
when designing this device centric interface. QEMU should be
improved to take care of it.