Re: [RFC v2 PATCH 00/10] vfio/pci: Add mmap() for DMABUFs
From: Christian König
Date: Fri Mar 13 2026 - 05:23:46 EST
On 3/12/26 19:45, Matt Evans wrote:
> Hi all,
>
>
> There were various suggestions in the September 2025 thread "[TECH
> TOPIC] vfio, iommufd: Enabling user space drivers to vend more
> granular access to client processes" [0], and LPC discussions, around
> improving the situation for multi-process userspace driver designs.
> This RFC series implements some of these ideas.
>
> (Thanks for feedback on v1! Revised series, with changes noted
> inline.)
>
> Background: Multi-process USDs
> ==============================
>
> The userspace driver scenario discussed in that thread involves a
> primary process driving a PCIe function through VFIO/iommufd, which
> manages the function-wide ownership/lifecycle. The function is
> designed to provide multiple distinct programming interfaces (for
> example, several independent MMIO register frames in one function),
> and the primary process delegates control of these interfaces to
> multiple independent client processes (which do the actual work).
> This scenario clearly relies on a HW design that provides appropriate
> isolation between the programming interfaces.
>
> The two key needs are:
>
> 1. Mechanisms to safely delegate a subset of the device MMIO
> resources to a client process without over-sharing wider access
> (or influence over whole-device activities, such as reset).
>
> 2. Mechanisms to allow a client process to do its own iommufd
> management w.r.t. its address space, in a way that's isolated
> from DMA relating to other clients.
>
>
> mmap() of VFIO DMABUFs
> ======================
>
> This RFC addresses #1 in "vfio/pci: Support mmap() of a VFIO DMABUF",
> implementing the proposals in [0] to add mmap() support to the
> existing VFIO DMABUF exporter.
>
> This enables a userspace driver to define DMABUF ranges corresponding
> to sub-ranges of a BAR, and grant a given client (via a shared fd)
> the capability to access (only) those sub-ranges. The VFIO device fds
> would be kept private to the primary process. All the client can do
> with that fd is map (or iomap via iommufd) that specific subset of
> resources, and the impact of bugs/malice is contained.
>
> (We'll follow up on #2 separately, as a related-but-distinct problem.
> PASIDs are one way to achieve per-client isolation of DMA; another
> could be sharing of a single IOVA space via 'constrained' iommufds.)
>
>
> New in v2: To achieve this, the existing VFIO BAR mmap() path is
> converted to use DMABUFs behind the scenes, in "vfio/pci: Convert BAR
> mmap() to use a DMABUF" plus new helper functions, as Jason/Christian
> suggested in the v1 discussion [3].
>
> This means:
>
> - Both regular and new DMABUF BAR mappings share the same vm_ops,
> i.e. mmap()ing DMABUFs is a smaller change on top of the existing
> mmap().
>
> - The zapping of mappings occurs via vfio_pci_dma_buf_move(), and the
> vfio_pci_zap_bars() originally paired with the _move()s can go
> away. Each DMABUF has a unique address_space.
>
> - It's a step towards future iommufd VFIO Type1 emulation
> implementing P2P, since iommufd can now get a DMABUF from a VA that
> it's mapping for IO; the VMAs' vm_file is that of the backing
> DMABUF.
>
>
> Revocation/reclaim
> ==================
>
> Mapping a BAR subset is useful, but the lifetime of access granted to
> a client needs to be managed well. For example, a protocol between
> the primary process and the client can indicate when the client is
> done, and when it's safe to reuse the resources elsewhere, but cleanup
> can't practically be cooperative.
>
> For robustness, we enable the driver to make the resources
> guaranteed-inaccessible when it chooses, so that it can re-assign them
> to other uses in future.
>
> "vfio/pci: Permanently revoke a DMABUF on request" adds a new VFIO
> device fd ioctl, VFIO_DEVICE_PCI_DMABUF_REVOKE. This takes a DMABUF
> fd parameter previously exported (from that device!) and permanently
> revokes the DMABUF. This notifies/detaches importers, zaps PTEs for
> any mappings, and guarantees no future attachment/import/map/access is
> possible by any means.
>
> A primary driver process would use this operation when the client's
> tenure ends to reclaim "loaned-out" MMIO interfaces, at which point
> the interfaces could be safely re-used.
>
> New in v2: ioctl() on VFIO driver fd, rather than DMABUF fd. A DMABUF
> is revoked using code common to vfio_pci_dma_buf_move(), selectively
> zapping mappings (after waiting for completion on the
> dma_buf_invalidate_mappings() request).
>
>
> BAR mapping access attributes
> =============================
>
> Inspired by Alex [Mastro] and Jason's comments in [0] and Mahmoud's
> work in [1] with the goal of controlling CPU access attributes for
> VFIO BAR mappings (e.g. WC), we can decorate DMABUFs with access
> attributes that are then used by a mapping's PTEs.
>
> I've proposed reserving a field in struct
> vfio_device_feature_dma_buf's flags to specify an attribute for its
> ranges. Although that keeps the (UAPI) struct unchanged, it means all
> ranges in a DMABUF share the same attribute. I feel a single
> attribute-to-mmap() relation is logical/reasonable. An application
> can also create multiple DMABUFs to describe any BAR layout and mix of
> attributes.
>
>
> Tests
> =====
>
> (Still sharing the [RFC ONLY] userspace test/demo program for context,
> not for merge.)
>
> It illustrates & tests various map/revoke cases, but doesn't use the
> existing VFIO selftests and relies on a (tweaked) QEMU EDU function.
> I'm (still) working on integrating the scenarios into the existing
> VFIO selftests.
>
> This code has been tested in mapping DMABUFs of single/multiple
> ranges, aliasing mmap()s, aliasing ranges across DMABUFs, vm_pgoff >
> 0, revocation, shutdown/cleanup scenarios, and hugepage mappings seem
> to work correctly. I've lightly tested WC mappings also (by observing
> resulting PTEs as having the correct attributes...).
>
>
> Fin
> ===
>
> v2 is based on next-20260310 (to build on Leon's recent series
> "vfio: Wait for dma-buf invalidation to complete" [2]).
>
>
> Please share your thoughts! I'd like to de-RFC if we feel this
> approach is now fair.
I only skimmed over it, but at least offhand I couldn't find anything fundamentally wrong.
The locking order seems to change in patch #6. In general I strongly recommend enabling lockdep while testing anyway, but especially when I see such changes.
In addition to that, it might also be a good idea to have a lockdep initcall function which defines the locking order that all the VFIO code should follow.
See the function dma_resv_lockdep() for an example of how to do that. Especially with mmap support and all the locks involved with that, it has proven to be good practice to have something like that.
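To illustrate the idea of priming a lock order up front, here is a minimal toy model in plain userspace C. It is only a sketch: it is not kernel code, it does not use the real lockdep API, and it records direct lock pairs only, whereas lockdep tracks the full dependency graph. The lock class names, their order, and all toy_* helpers are hypothetical and chosen purely for illustration; they are not a statement about what the VFIO locking order is or should be.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, listed in the order we want to enforce. */
enum toy_lock { LOCK_MMAP, LOCK_MEMORY, LOCK_RESV, NR_LOCKS };

/* taken_before[a][b] is true once a has been held while b was acquired. */
static bool taken_before[NR_LOCKS][NR_LOCKS];
static enum toy_lock held[NR_LOCKS];
static int nheld;

/*
 * Acquire a lock class. Returns false (and does not take the lock) if
 * this would invert a previously recorded order or re-take a held lock.
 */
static bool toy_acquire(enum toy_lock lock)
{
	for (int i = 0; i < nheld; i++) {
		if (held[i] == lock)
			return false;	/* recursive acquisition */
		if (taken_before[lock][held[i]])
			return false;	/* inversion against recorded order */
	}
	for (int i = 0; i < nheld; i++)
		taken_before[held[i]][lock] = true;
	held[nheld++] = lock;
	return true;
}

/* Release the most recently acquired lock (LIFO only in this toy). */
static void toy_release(enum toy_lock lock)
{
	if (nheld > 0 && held[nheld - 1] == lock)
		nheld--;
}

/*
 * Analogue of a lockdep priming initcall: take every lock once, nested
 * in the canonical order, so the order is recorded before any real
 * code path runs.
 */
static void toy_prime_order(void)
{
	(void)toy_acquire(LOCK_MMAP);
	(void)toy_acquire(LOCK_MEMORY);
	(void)toy_acquire(LOCK_RESV);
	toy_release(LOCK_RESV);
	toy_release(LOCK_MEMORY);
	toy_release(LOCK_MMAP);
}
```

After toy_prime_order() has run, a path that holds LOCK_RESV and then tries to take LOCK_MMAP is rejected on its first attempt, even if the opposite nesting never actually occurs at runtime. That is the benefit of priming: an inverted order is reported the first time it is exercised, without having to hit both orderings during testing.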
Regards,
Christian.
>
>
> Many thanks,
>
>
> Matt
>
>
>
> References:
>
> [0]: https://lore.kernel.org/linux-iommu/20250918214425.2677057-1-amastro@xxxxxx/
> [1]: https://lore.kernel.org/all/20250804104012.87915-1-mngyadam@xxxxxxxxx/
> [2]: https://lore.kernel.org/linux-iommu/20260205-nocturnal-poetic-chamois-f566ad@houat/T/#m310cd07011e3a1461b6fda45e3f9b886ba76571a
> [3]: https://lore.kernel.org/all/20260226202211.929005-1-mattev@xxxxxxxx/
>
> --------------------------------------------------------------------------------
> Changelog:
>
> v2: Respin based on the feedback/suggestions:
>
> - Transform the existing VFIO BAR mmap path to also use DMABUFs behind
> the scenes, and then simply share that code for explicitly-mapped
> DMABUFs.
>
> - Refactors the export itself out of vfio_pci_core_feature_dma_buf,
> and shared by a new vfio_pci_core_mmap_prep_dmabuf helper used by
> the regular VFIO mmap to create a DMABUF.
>
> - Revoke buffers using a VFIO device fd ioctl
>
> v1: https://lore.kernel.org/all/20260226202211.929005-1-mattev@xxxxxxxx/
>
>
> Matt Evans (10):
> vfio/pci: Set up VFIO barmap before creating a DMABUF
> vfio/pci: Clean up DMABUFs before disabling function
> vfio/pci: Add helper to look up PFNs for DMABUFs
> vfio/pci: Add a helper to create a DMABUF for a BAR-map VMA
> vfio/pci: Convert BAR mmap() to use a DMABUF
> vfio/pci: Remove vfio_pci_zap_bars()
> vfio/pci: Support mmap() of a VFIO DMABUF
> vfio/pci: Permanently revoke a DMABUF on request
> vfio/pci: Add mmap() attributes to DMABUF feature
> [RFC ONLY] selftests: vfio: Add standalone vfio_dmabuf_mmap_test
>
> drivers/vfio/pci/Kconfig | 3 +-
> drivers/vfio/pci/Makefile | 3 +-
> drivers/vfio/pci/vfio_pci_config.c | 18 +-
> drivers/vfio/pci/vfio_pci_core.c | 123 +--
> drivers/vfio/pci/vfio_pci_dmabuf.c | 425 +++++++--
> drivers/vfio/pci/vfio_pci_priv.h | 46 +-
> include/uapi/linux/vfio.h | 42 +-
> tools/testing/selftests/vfio/Makefile | 1 +
> .../vfio/standalone/vfio_dmabuf_mmap_test.c | 837 ++++++++++++++++++
> 9 files changed, 1339 insertions(+), 159 deletions(-)
> create mode 100644 tools/testing/selftests/vfio/standalone/vfio_dmabuf_mmap_test.c
>