Re: [PATCH 0/3] vfio: Device memory DMA mapping improvements

From: Jason Gunthorpe
Date: Fri Feb 12 2021 - 16:00:55 EST


On Fri, Feb 12, 2021 at 12:27:19PM -0700, Alex Williamson wrote:
> This series intends to improve some long standing issues with mapping
> device memory through the vfio IOMMU interface (i.e. P2P DMA mappings).
> Unlike mapping DMA to RAM, we can't pin device memory, nor is it
> always accessible. We attempt to tackle this (predominantly the
> first issue in this iteration) by creating a registration and
> notification interface through vfio-core, between the IOMMU backend
> and the bus driver. This allows us to do things like automatically
> remove a DMA mapping to a device if it's closed by the user. We also
> keep references to the device container such that it remains isolated
> while this mapping exists.
>
> Unlike my previous attempt[1], this version works across containers.
> For example if a user has device1 with IOMMU context in container1
> and device2 in container2, a mapping of device2 memory into container1
> IOMMU context would be removed when device2 is released.
>
> What I don't tackle here is when device memory is disabled, such as
> for a PCI device when the command register memory bit is cleared or
> while the device is in reset. Ideally it seems like it might be
> nice to have IOMMU API interfaces that could remove r/w permissions
> from the IOTLB entry w/o removing it entirely, but I'm also unsure
> of the ultimate value in modifying the IOTLB entries at this point.
>
> In the PCI example again, I'd expect a DMA to disabled or unavailable
> device memory to get an Unsupported Request response. If we play
> with the IOTLB mapping, we might change this to an IOMMU fault for
> either page permissions or page not present, depending on how we
> choose to invalidate that entry. However, it seems that a system that
> escalates an UR error to fatal, through things like firmware first
> handling, is just as likely to also make the IOMMU fault fatal. Are
> there cases where we expect otherwise, and if not is there value to
> tracking device memory enable state to that extent in the IOMMU?
>
> Jason, I'm also curious if this scratches your itch relative to your
> suggestion to solve this with dma-bufs, and if that's still your
> preference, I'd love an outline to accomplish the same with that
> method.

I will look at this more closely later, but given this is solving a
significant security problem and the patches now exist, I'm not
inclined to push too hard to do something different if this works OK.

That said, it is not great to see VFIO create its own little
dmabuf-like thing inside itself. In particular, if this were in core
code we could add a new vm_operations_struct member like:

struct dma_buf *(*getdma_buf)(struct vm_area_struct *area);

And completely avoid a lot of the searching and fiddling with
ops. Maybe we can make this look closer to that ideal.
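
To make the idea concrete, here is a rough userspace sketch of how such a
callback might be consumed. Everything in it is an assumption for
illustration only: the structs are simplified stand-ins rather than the
kernel definitions, the member is assumed to take a struct vm_area_struct
and return a struct dma_buf pointer, and demo_getdma_buf() and
vma_to_dma_buf() are hypothetical names that exist in no tree.

```c
#include <stddef.h>

/* Userspace stand-ins for kernel types; NOT the real definitions. */
struct dma_buf {
	int id;
};

struct vm_area_struct;

struct vm_operations_struct {
	/* Hypothetical member sketched in this mail; not in mainline. */
	struct dma_buf *(*getdma_buf)(struct vm_area_struct *area);
};

struct vm_area_struct {
	const struct vm_operations_struct *vm_ops;
	void *vm_private_data;
};

/*
 * Hypothetical exporter side: the driver that owns the mapping hands
 * back the dma_buf it stashed in vm_private_data at mmap time.
 */
static struct dma_buf *demo_getdma_buf(struct vm_area_struct *area)
{
	return area->vm_private_data;
}

static const struct vm_operations_struct demo_vm_ops = {
	.getdma_buf = demo_getdma_buf,
};

/*
 * Hypothetical importer side: ask the vma for its dma_buf directly.
 * Returns NULL when the vma is missing or its ops do not implement
 * the callback, so no comparison against known ops is needed.
 */
static struct dma_buf *vma_to_dma_buf(struct vm_area_struct *area)
{
	if (!area || !area->vm_ops || !area->vm_ops->getdma_buf)
		return NULL;
	return area->vm_ops->getdma_buf(area);
}
```

The point of the shape is that the importer never has to search for, or
compare against, a particular driver's vm_ops; a vma that does not
implement the callback simply yields NULL.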

Jason