RE: [RFC 05/20] vfio/pci: Register device to /dev/vfio/devices

From: Tian, Kevin
Date: Wed Sep 22 2021 - 19:49:26 EST


> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Thursday, September 23, 2021 5:17 AM
>
> On Wed, 22 Sep 2021 01:19:08 +0000
> "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>
> > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > Sent: Wednesday, September 22, 2021 5:09 AM
> > >
> > > On Tue, 21 Sep 2021 13:40:01 -0300
> > > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> > >
> > > > On Sun, Sep 19, 2021 at 02:38:33PM +0800, Liu Yi L wrote:
> > > > > This patch exposes the device-centric interface for vfio-pci devices. To
> > > > > be compatiable with existing users, vfio-pci exposes both legacy group
> > > > > interface and device-centric interface.
> > > > >
> > > > > As explained in last patch, this change doesn't apply to devices which
> > > > > cannot be forced to snoop cache by their upstream iommu. Such
> devices
> > > > > are still expected to be opened via the legacy group interface.
> > >
> > > This doesn't make much sense to me. The previous patch indicates
> > > there's work to be done in updating the kvm-vfio contract to understand
> > > DMA coherency, so you're trying to limit use cases to those where the
> > > IOMMU enforces coherency, but there's QEMU work to be done to
> support
> > > the iommufd uAPI at all. Isn't part of that work to understand how KVM
> > > will be told about non-coherent devices rather than "meh, skip it in the
> > > kernel"? Also let's not forget that vfio is not only for KVM.
> >
> > The policy here is that VFIO will not expose such devices (no enforce-snoop)
> > in the new device hierarchy at all. In this case QEMU will fall back to the
> > group interface automatically and then rely on the existing contract to
> connect
> > vfio and QEMU. It doesn't need to care about the whatever new contract
> > until such devices are exposed in the new interface.
> >
> > yes, vfio is not only for KVM. But here it's more a task split based on staging
> > consideration. imo it's not necessary to further split task into supporting
> > non-snoop device for userspace driver and then for kvm.
>
> Patch 10 introduces an iommufd interface for QEMU to learn whether the
> IOMMU enforces DMA coherency, at that point QEMU could revert to the
> legacy interface, or register the iommufd with KVM, or otherwise
> establish non-coherent DMA with KVM as necessary. We're adding cruft
> to the kernel here to enforce an unnecessary limitation.
>
> If there are reasons the kernel can't support the device interface,
> that's a valid reason not to present the interface, but this seems like
> picking a specific gap that userspace is already able to detect from
> this series at the expense of other use cases. Thanks,
>

I see your point now. Yes I agree that the kernel cruft is unnecessary
limitation here. The user should rely on the device/iommufd capability
to decide whether non-coherent DMA should go through legacy or
new interface.

Thanks
Kevin