On Tue, 2015-10-06 at 18:23 +0300, Avi Kivity wrote:
> On 10/06/2015 05:56 PM, Michael S. Tsirkin wrote:
> > On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote:
> > > The only "like VFIO" behavior we implement here is binding the MSI-X
> > > interrupt notification to eventfd descriptor.
> >
> > There will be more if you add some basic memory protections.
> >
> > Besides, that's not true.
> > Your patch queries MSI capability, sets # of vectors.
> > You even hinted you want to add BAR mapping down the road.
>
> BAR mapping is already available from sysfs; it is not mandatory.
>
> > VFIO does all of that.
>
> Copying vfio maintainer Alex (hi!).
>
> vfio's charter is modern iommu-capable configurations. It is designed to
> be secure enough to be usable by an unprivileged user.
>
> For performance and hardware reasons, many dpdk deployments use
> uio_pci_generic. They are willing to trade off the security provided by
> vfio for the performance and deployment flexibility of uio_pci_generic.
> Forcing these features into vfio will compromise its security and
> needlessly complicate its code (I guess it can be done with a "null"
> iommu, but then vfio will have to decide whether it is secure or not).

It's not just the iommu model vfio uses, it's that vfio is built around
iommu groups. For instance to use a device in vfio, the user opens the
vfio group file and asks for the device within that group. That's a
fairly fundamental part of the mechanics to sidestep.
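
To make the group-first mechanics concrete, here is a minimal Python sketch of how userspace finds the group file for a device before it can ask vfio for that device. It assumes the usual sysfs layout, where /sys/bus/pci/devices/<BDF>/iommu_group is a symlink whose last component is the group number, and that group files live under /dev/vfio/. The function name and the sysfs_root and dev_root parameters are made up for illustration; the actual open and ioctl calls on the group file are left out.

```python
import os


def vfio_group_path(bdf, sysfs_root="/sys", dev_root="/dev"):
    """Return the vfio group file path for the PCI device `bdf`.

    Illustrative helper, not part of any library. Raises
    FileNotFoundError when the device has no iommu_group link,
    which is what one would expect on a system without an iommu.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "iommu_group")
    if not os.path.islink(link):
        raise FileNotFoundError(f"{bdf}: no iommu group, nothing for vfio to open")
    # The symlink target ends in the group number, e.g. .../iommu_groups/26
    group = os.path.basename(os.readlink(link))
    return os.path.join(dev_root, "vfio", group)
```

A device without the iommu_group link has no group file to open, which is why the group lookup is hard to sidestep.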

However, is there an opportunity at a lower level? Systems without an
iommu typically have dma ops handled via a software iotlb (i.e. bounce
buffers), but I think they simply don't have iommu ops registered.
Could a no-iommu, iommu subsystem provide enough dummy iommu ops to fake
out vfio? It would need to iterate the devices on the bus and come up
with dummy iommu groups and dummy versions of iommu_map and unmap. The
grouping is easy, one device per group, there's no isolation anyway.

The vfio type1 iommu backend will do pinning, which seems like an
improvement over the mlock that uio users probably try to do now. I
guess the no-iommu map would error if the IOVA isn't simply the bus
address of the page mapped.
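
A toy Python model of the idea, not kernel code: every device gets its own dummy group, and map only accepts identity mappings where the IOVA equals the bus address. All names here (assign_groups, NoIommuDomain) are hypothetical, and the bus address is supplied by the caller rather than looked up from pinned pages.

```python
import errno


def assign_groups(devices):
    """One device per dummy group; there is no isolation to model."""
    return {dev: group_id for group_id, dev in enumerate(devices)}


class NoIommuDomain:
    """Hypothetical stand-in for dummy iommu_map/iommu_unmap ops."""

    def __init__(self):
        self.mappings = {}  # iova -> size in bytes

    def map(self, iova, bus_addr, size):
        # Without translation hardware the device sees bus addresses,
        # so any IOVA other than the bus address cannot be honoured.
        if iova != bus_addr:
            raise OSError(errno.EINVAL, "no-iommu: iova must equal bus address")
        self.mappings[iova] = size

    def unmap(self, iova):
        # Returns the number of bytes unmapped, 0 if nothing was mapped.
        return self.mappings.pop(iova, 0)
```

Pinning and the taint are outside this model; it only shows the grouping and the identity-map check.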

Of course this is entirely unsafe and this no-iommu driver should taint
the kernel, but it at least standardizes on one userspace API and you're
already doing completely unsafe things with uio. vfio should be
enlightened at least to the point that it allows only privileged users
access to devices under such a (lack of) iommu.

> > > This doesn't justify the
> > > hassle of implementing IOMMU-less VFIO mode.
> >
> > This applies to both VFIO and UIO really. I'm not sure the hassle of
> > maintaining this functionality in tree is justified. It remains to be
> > seen whether there are any users that won't taint the kernel.
> > Apparently not in the current form of the patch, but who knows.
>
> It is not msix that taints the kernel, it's uio_pci_generic. Msix is a
> tiny feature addition that doesn't change the security situation one bit.
>
> btw, currently you can map BARs and dd to /dev/mem to your heart's
> content without tainting the kernel. I don't see how you can claim that
> msix support makes the situation worse, when root can access every bit
> of physical memory, either directly or via DMA.