[PATCH v3 00/10] vfio: capability chains, sparse mmaps, device specific regions, IGD support
From: Alex Williamson
Date: Tue Feb 16 2016 - 16:09:44 EST
v3:
Dropped GMCH and BDSM clearing and protection from the kernel. The
more I thought about this, the less clearing these registers looked
like vfio's responsibility. It doesn't protect the host, it's only a
feeble attempt to prevent the device from stomping on user memory.
But any DMA capable device has the ability to do that; the user needs
to know how to drive it or shouldn't be touching it. This isn't gone
completely, QEMU just needs to do it.
NB, the ASL Storage register still needs protection because unlike the
above two, it's a read-write scratch register. Giving the user the
ability to write directly to it makes for a one-time use device. So
we still virtualize that in the kernel.
Rebased to v4.5-rc4.
v2:
v2 includes more IGD support. Read only access to the host bridge
and LPC bridge config space is provided to allow configuration of
emulated devices for VM use cases. We also try to hide the stolen
memory window from the user. This probably needs additional work as
I'd prefer not to need to update the code for every new graphics chip.
Perhaps we should instead blacklist anything pre-SandyBridge, identify
SandyBridge specifically, and hope that anything unknown after that
uses the gen8+ register layout. Additionally since IGD doesn't have
a real ROM BAR, but does typically have its vBIOS in shadow ROM space,
expose that as if it were a ROM BAR. This series should be used in
conjunction with:
Subject: [PATCH] pci: Wait for up to an additional 1000ms after FLR reset
for use on laptops.
v1:
We have a number of cases where we want to extend the vfio API to
provide further details in the vfio INFO ioctls. For instance we take
it as implicit that we can't mmap over MSI-X vector tables of a BAR,
but we'd prefer to have the API define that explicitly as a sparse
mmap capable region. We have some devices that need additional
regions, but we don't want to "burn" a region index for something
specific to a single device. We also have the ongoing problem of
describing valid IOVA ranges for an IOMMU. This series doesn't solve
every case of those problems, but it solves some and gives us the vfio
level API to solve the others.
To do this we use capability chains, much like they're used in PCI.
A flag bit in the INFO ioctl structure tells us whether a capability
chain is present and new fields are defined to provide the buffer
index of the first capability. Each capability provides the start
index of the next capability along with an identifier and version of
itself. The existing argsz field is used to convey to the user the
necessary buffer size to retrieve all of the capabilities. A few
helpers in the vfio core simplify the mechanics of adding
capabilities for the bus and iommu drivers to make use of.
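As a rough illustration of the chain walk described above, here is a
minimal userspace sketch. The struct layout and the names cap_header
and find_cap are assumptions made for this example, not the
definitions from the patches; it also assumes that a next offset of 0
terminates the chain and that offsets are relative to the start of the
INFO buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, modeled on the description above. */
struct cap_header {
	uint16_t id;		/* identifies the capability */
	uint16_t version;	/* version of this capability */
	uint32_t next;		/* buffer offset of next capability, 0 = end */
};

/*
 * Walk the chain starting at offset 'first' within a buffer of 'argsz'
 * bytes.  Returns the offset of the first capability whose id matches,
 * or 0 if there is none or the chain is malformed.
 */
static uint32_t find_cap(const void *buf, size_t argsz, uint32_t first,
			 uint16_t id)
{
	uint32_t off = first;
	/* A well-formed chain cannot hold more headers than this. */
	size_t max_hops = argsz / sizeof(struct cap_header);

	while (off != 0 && max_hops--) {
		struct cap_header hdr;

		/* Refuse headers that would run past the buffer. */
		if (off > argsz || argsz - off < sizeof(hdr))
			return 0;

		memcpy(&hdr, (const uint8_t *)buf + off, sizeof(hdr));
		if (hdr.id == id)
			return off;

		off = hdr.next;
	}

	return 0;
}
```

The hop limit is only there so that a looping chain cannot spin the
caller forever; a caller would first size the buffer from the argsz
value returned by the ioctl and only then walk it.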
The sparse mmap capability solves the problem of regions which can
only be partially mmaped, such as when an MSI-X table is present.
This is also expected to be useful for vGPU support should a device
have a mix of direct access and emulated access within the same
region.
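To make the MSI-X case concrete, the sketch below computes the
mmappable areas of a region that contains a single non-mmappable hole.
The names sparse_area and areas_around_hole are hypothetical, and the
rounding of the hole out to page boundaries is an assumption of this
example (mmap works at page granularity); it is not taken from the
patches. It further assumes page_size is a power of two and the hole
lies within the region.

```c
#include <stdint.h>

/* Hypothetical description of one mmappable area within a region. */
struct sparse_area {
	uint64_t offset;	/* start of the area within the region */
	uint64_t size;		/* length of the area in bytes */
};

/*
 * Split [0, region_size) around a hole such as an MSI-X vector table.
 * The hole is widened to page boundaries first.  Fills 'out' with the
 * areas that remain mmappable and returns how many there are (0 to 2).
 */
static unsigned int areas_around_hole(uint64_t region_size,
				      uint64_t hole_off, uint64_t hole_size,
				      uint64_t page_size,
				      struct sparse_area out[2])
{
	uint64_t mask = page_size - 1;
	uint64_t start = hole_off & ~mask;			/* round down */
	uint64_t end = (hole_off + hole_size + mask) & ~mask;	/* round up */
	unsigned int n = 0;

	if (end > region_size)
		end = region_size;

	if (start > 0) {		/* area below the hole */
		out[n].offset = 0;
		out[n].size = start;
		n++;
	}
	if (end < region_size) {	/* area above the hole */
		out[n].offset = end;
		out[n].size = region_size - end;
		n++;
	}

	return n;
}
```

A user would mmap each returned area separately and fall back to
read/write access for anything in between.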
The device specific region capability allows us to easily add new
regions that are device specific. Included here is the IGD OpRegion,
which is a host memory region exclusively for the configuration and
use of Intel graphics devices, but is not part of the device in the
PCI sense. There are potentially other regions we can expose on this
device to further facilitate use of it.
I particularly welcome feedback on how we identify device specific
regions. Here I've used a type and sub-type field where I've defined
one bit of the type field to identify a vendor specific type with a
mask to identify the vendor. In the OpRegion case here, that defines
an 8086 set of sub-types where I've simply defined sub-type 1 as an
IGD OpRegion. We could of course get the vendor from the device
itself, but this method might promote code re-use if we eventually
have multiple vendors using regions for the same purpose. At least
that's my thinking.
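For discussion purposes, the type/sub-type scheme could be pictured
roughly as below. The macro and function names are hypothetical, and
placing the vendor flag in bit 31 with the vendor ID in the low 16 bits
is an assumption of this sketch; only the use of vendor 8086 and
sub-type 1 for the IGD OpRegion comes from the description above.

```c
#include <stdint.h>

/* Assumed encoding: one flag bit plus a 16-bit PCI vendor ID. */
#define REGION_TYPE_VENDOR_FLAG		(1u << 31)
#define REGION_TYPE_VENDOR_MASK		0xffffu

/* Sub-types are defined per vendor; 1 is the IGD OpRegion for 8086. */
#define SUBTYPE_INTEL_IGD_OPREGION	1u

/* Build the type value for a vendor specific region. */
static uint32_t vendor_region_type(uint16_t vendor)
{
	return REGION_TYPE_VENDOR_FLAG | vendor;
}

/* Non-zero if 'type' names a vendor specific region. */
static int is_vendor_type(uint32_t type)
{
	return (type & REGION_TYPE_VENDOR_FLAG) != 0;
}

/* Extract the vendor ID; only meaningful when is_vendor_type() holds. */
static uint16_t region_vendor(uint32_t type)
{
	return (uint16_t)(type & REGION_TYPE_VENDOR_MASK);
}

/* Non-zero if the type/sub-type pair identifies an IGD OpRegion. */
static int is_igd_opregion(uint32_t type, uint32_t subtype)
{
	return is_vendor_type(type) &&
	       region_vendor(type) == 0x8086 &&
	       subtype == SUBTYPE_INTEL_IGD_OPREGION;
}
```

Leaving the flag bit clear would keep the remaining type space
available for regions whose purpose is shared across vendors.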
Appreciate feedback. Thanks,
Alex
---
Alex Williamson (10):
vfio: Define capability chains
vfio: Add capability chain helpers
vfio: Define sparse mmap capability for regions
vfio/pci: Include sparse mmap capability for MSI-X table regions
vfio: Define device specific region type capability
vfio/pci: Add infrastructure for additional device specific regions
vfio/pci: Enable virtual register in PCI config space
vfio/pci: Intel IGD OpRegion support
vfio/pci: Intel IGD host and LPC bridge config space access
vfio/pci: Expose shadow ROM as PCI option ROM
drivers/vfio/pci/Kconfig | 4 +
drivers/vfio/pci/Makefile | 1
drivers/vfio/pci/vfio_pci.c | 176 +++++++++++++++++++++-
drivers/vfio/pci/vfio_pci_config.c | 45 +++++-
drivers/vfio/pci/vfio_pci_igd.c | 280 +++++++++++++++++++++++++++++++++++
drivers/vfio/pci/vfio_pci_private.h | 39 +++++
drivers/vfio/pci/vfio_pci_rdwr.c | 9 +
drivers/vfio/vfio.c | 54 +++++++
include/linux/vfio.h | 11 +
include/uapi/linux/vfio.h | 92 +++++++++++-
10 files changed, 692 insertions(+), 19 deletions(-)
create mode 100644 drivers/vfio/pci/vfio_pci_igd.c