Re: [PATCH RFC v2 00/18] Add VFIO mediated device support and DEV-MSI support for the idxd driver

From: Alex Williamson
Date: Wed Aug 05 2020 - 21:25:10 EST


On Thu, 23 Jul 2020 21:19:30 -0300
Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:

> On Tue, Jul 21, 2020 at 11:54:49PM +0000, Tian, Kevin wrote:
> > In a nutshell, applications don't require raw WQ controllability as guest
> > kernel drivers may expect. Extending DSA user space interface to be another
> > passthrough interface just for virtualization needs is less compelling than
> > leveraging established VFIO/mdev framework (with the major merit that
> > existing user space VMMs just work w/o any change as long as they already
> > support VFIO uAPI).
>
> Sure, but the above is how the cover letter should have summarized
> that discussion, not as "it is not much code difference"
>
> > In last review you said that you didn't hard nak this approach and would
> > like to hear opinion from virtualization guys. In this version we CCed KVM
> > mailing list, Paolo (VFIO/Qemu), Alex (VFIO), Samuel (Rust-VMM/Cloud
> > hypervisor), etc. Let's see how they feel about this approach.
>
> Yes, the VFIO community should decide.
>
> If we are doing emulation tasks in the kernel now, then I can think of
> several nice semi-emulated mdevs to propose.
>
> This will not be some one off, but the start of a widely copied
> pattern.

And that's definitely a concern, there should be a reason for
implementing device emulation in the kernel beyond an easy path to get
a device exposed up through a virtualization stack. The entire idea of
mdev is the mediation of access to a device to make it safe for a user
and to fit within the vfio device API. Mediation, emulation, and
virtualization can be hard to differentiate, and there is some degree of
emulation required to fill out the device API, for vfio-pci itself
included. So I struggle with a specific measure of where to draw the
line, and also whose authority it is to draw that line. I don't think
it's solely mine, that's something we need to decide as a community.

If you see this as an abuse of the framework, then let's identify those
specific issues and come up with a better approach. As we've discussed
before, things like basic PCI config space emulation are acceptable
overhead and low risk (imo) and some degree of register emulation is
well within the territory of an mdev driver. Drivers are accepting
some degree of increased attack surface by each addition of a uAPI and
the complexity of those uAPIs, but it seems largely a decision for
those drivers whether they're willing to take on that responsibility
and burden.

At some point, possibly in the near-ish future, we might have a
vfio-user interface with userspace vfio-over-socket servers that might
be able to consume existing uAPIs and offload some of this complexity
and emulation to userspace while still providing an easy path to insert
devices into the virtualization stack. Hopefully if/when that comes
along, it would provide these sorts of drivers an opportunity to
offload some of the current overhead out to userspace, but I'm not sure
it's worth denying a mainline implementation now. Thanks,

Alex