Re: [RFC PATCH 3/7] vfio/pci: Support mmap() of a DMABUF

From: Christian König

Date: Mon Mar 02 2026 - 05:13:37 EST


On 2/27/26 23:04, Jason Gunthorpe wrote:
> On Fri, Feb 27, 2026 at 01:52:15PM -0800, Alex Mastro wrote:
>> On Fri, Feb 27, 2026 at 03:48:07PM -0400, Jason Gunthorpe wrote:
>>>>> I actually would like to go the other way and have VFIO always have a
>>>>> DMABUF under the VMA's it mmaps because that will make it easy to
>>>>> finish the type1 emulation which requires finding dmabufs for the
>>>>> VMAs.
>>>
>>> This is a still better idea since it avoid duplicating the VMA flow
>>> into two parts..
>>
>> I suppose this would also compose with your idea to use dma-buf for
>> iommufd_compat support of VFIO_IOMMU_MAP_DMA of vfio device fd-backed mmap()s
>> [1]? Instead of needing to materialize a new dma-buf, you could use the existing
>> backing one?
>
> Yeah, that too
>
> I think it is a fairly easy progression:
>
> 1) mmap_prepare() allocates a new dmabuf file * and sticks it in
> desc->vm_file. Rework so all the vma_ops are using vm_file that is
> a dmabuf. The allocated dmabuf has a singleton range

Interesting approach to fix this, but I would suggest something even simpler:

Use the same structure as the base class of the vma->vm_file->private_data object for both the VFIO file and the DMA-buf file.

The object for the DMA-buf file then contains the real ranges it exposes and points to the exporting VFIO, while the one for the VFIO file is just a dummy that covers the whole range and points to itself.

This way you should be able to use the same vm_operations_struct for VMAs mapped through both DMA-buf and the VFIO file descriptors.
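
Roughly what I have in mind, as an untested userspace sketch. All struct and function names here (vfio_mmap_base, vfio_dev_file, vfio_dmabuf_file, shared_fault_lookup, the fixed limit of 4 ranges) are made up for illustration and are not existing kernel or VFIO interfaces:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one contiguous piece of device address space. */
struct mmap_range {
	uint64_t offset;	/* offset inside the device region */
	uint64_t len;
};

/* Hypothetical common base, i.e. what vm_file->private_data would point to. */
struct vfio_mmap_base {
	struct vfio_mmap_base *exporter;	/* object that owns the MMIO */
	const struct mmap_range *ranges;
	size_t nr_ranges;
};

/* VFIO file: dummy single range covering everything, points to itself. */
struct vfio_dev_file {
	struct vfio_mmap_base base;
	struct mmap_range whole;
};

/* DMA-buf file: the real ranges, points to the exporting VFIO object. */
struct vfio_dmabuf_file {
	struct vfio_mmap_base base;
	struct mmap_range ranges[4];
};

static void vfio_dev_file_init(struct vfio_dev_file *f, uint64_t size)
{
	f->whole = (struct mmap_range){ .offset = 0, .len = size };
	f->base.exporter = &f->base;
	f->base.ranges = &f->whole;
	f->base.nr_ranges = 1;
}

static void vfio_dmabuf_file_init(struct vfio_dmabuf_file *f,
				  struct vfio_dev_file *dev,
				  const struct mmap_range *r, size_t n)
{
	assert(n <= 4);
	for (size_t i = 0; i < n; i++)
		f->ranges[i] = r[i];
	f->base.exporter = &dev->base;
	f->base.ranges = f->ranges;
	f->base.nr_ranges = n;
}

/*
 * Shared by both file types: translate an offset inside the mapping
 * into a device offset by walking the ranges in order. Returns false
 * when the offset lies beyond the last range.
 */
static bool shared_fault_lookup(const struct vfio_mmap_base *b,
				uint64_t map_off, uint64_t *dev_off)
{
	for (size_t i = 0; i < b->nr_ranges; i++) {
		if (map_off < b->ranges[i].len) {
			*dev_off = b->ranges[i].offset + map_off;
			return true;
		}
		map_off -= b->ranges[i].len;
	}
	return false;
}
```

The fault path only ever sees the base structure, so it doesn't need to know which kind of file the VMA came from.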


Independent of how you implement this, just one additional warning: huge_fault has caused a number of really hard-to-debug problems on x86.

As far as I know, the background is that on x86 pte_special() only works on true leaf PTEs, but not on PMDs/PUDs.

That in turn results in some nasty surprises when your PFNs are potentially backed by struct pages, e.g. for direct I/O. For example, on the resulting mmap() get_user_pages_fast() works, but get_user_pages() doesn't.

I hope that those problems aren't applicable here, but if they are, Thomas from the Intel XE team can give you more details on that stuff.

Regards,
Christian.

> 2) Teach the fault handlers to support full range semantics
> 3) Use dmabuf revoke variables/etc in the mmap fault handlers
> 4) Move the address space from the vfio to the dmabuf
> 5) Allow mmaping the dmabuf fd directly which is now only a couple lines
>
> I forget how all the different mmap implementations in vfio interact
> though - but I think the above is good for vfio-pci
>
> Jason