RE: [RFC 06/20] iommu: Add iommu_device_init[exit]_user_dma interfaces

From: Tian, Kevin
Date: Tue Sep 28 2021 - 03:43:43 EST


> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Tuesday, September 28, 2021 3:20 AM
>
> On Mon, 27 Sep 2021 13:32:34 +0000
> "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>
> > > From: Jason Gunthorpe
> > > Sent: Monday, September 27, 2021 9:10 PM
> > >
> > > On Mon, Sep 27, 2021 at 01:00:08PM +0000, Tian, Kevin wrote:
> > >
> > > > > I think for such a narrow usage you should not change the struct
> > > > > device_driver. Just have pci_stub call a function to flip back to user
> > > > > mode.
> > > >
> > > > Here we want to ensure that kernel dma should be blocked
> > > > if the group is already marked for user-dma. If we just blindly
> > > > do it for any driver at this point (as you commented earlier):
> > > >
> > > > + ret = iommu_set_kernel_ownership(dev);
> > > > + if (ret)
> > > > + return ret;
> > > >
> > > > how would pci-stub reach its function to indicate that it doesn't
> > > > do dma and flip back?
> > >
> > > > Do you envision a simpler policy that no driver can be bound
> > > > to the group if it's already set for user-dma? what about vfio-pci
> > > > itself?
> > >
> > > Yes.. I'm not sure there is a good use case to allow the stub drivers
> > > to load/unload while a VFIO is running. At least, not a strong enough
> > > one to justify a global change to the driver core..
> >
> > I'm fine with not loading pci-stub. From the very 1st commit msg
> > looks pci-stub was introduced before vfio to prevent host driver
> > loading when doing device assignment with KVM. I'm not sure
> > whether other usages are built on pci-stub later, but in general it's
> > not good to position devices in a same group into different usages.
>
> IIRC, pci-stub was invented for legacy KVM device assignment because
> KVM was never an actual device driver, it just latched onto and started
> using the device. If there was an existing driver for the device then
> KVM would fail to get device resources. Therefore the device needed to
> be unbound from its standard host driver, but that left it susceptible
> to driver loads usurping the device. Therefore pci-stub came along to
> essentially claim the device on behalf of KVM.
>
> With vfio, there are a couple use cases of pci-stub that can be
> interesting. The first is that pci-stub is generally built into the
> kernel, not as a module, which provides users the ability to specify a
> list of ids for pci-stub to claim on the kernel command line with
> higher priority than loadable modules. This can prevent default driver
> bindings to devices until tools like driverctl or boot time scripting
> gets a shot to load the user designated driver for a device.
>
> The other use case, is that if a group is composed of multiple devices
> and all those devices are bound to vfio drivers, then the user can gain
> direct access to each of those devices. If we wanted to insert a
> barrier to restrict user access to certain devices within a group, we'd
> suggest binding those devices to pci-stub. Obviously within a group, it
> may still be possible to manipulate the device via p2p DMA, but the
> barrier is much higher and device, if not platform, specific to
> manipulate such devices. An example use case might be a chipset
> Ethernet controller grouped among system management function in a
> multi-function root complex integrated endpoint.

Thanks for the background. It perfectly reflects how many tricky things
that vfio has evolved to deal with and we'll dig them out again in this
refactoring process with your help. 😊

just a nit on the last example. If a system management function is
in such group, isn't the right policy is to disallow assigning any device
in this group? Even the barrier is high, any chance of allowing the guest
to control a system management function is dangerous...

>
> > but I'm little worried that even vfio-pci itself cannot be bound now,
> > which implies that all devices in a group which are intended to be
> > used by the user must be bound to vfio-pci in a breath before the
> > user attempts to open any of them, i.e. late-binding and device-
> > hotplug is disallowed after the initial open. I'm not sure how
> > important such an usage would be, but it does cause user-tangible
> > semantics change.
>
> Yep, a high potential to break userspace, especially as pci-stub has
> been recommended for the cases noted above. I don't expect that tools
> like libvirt manage unassigned devices within a group, but that
> probably means that there are all sorts of ad-hoc user mechanisms
> beyond simply assigning all the devices. Thanks,
>

Thanks
Kevin