Re: [PATCH 1/9] vfio/pci: Fix vfio_pci_dma_buf_cleanup() double-put

From: Matt Evans

Date: Wed May 06 2026 - 09:53:42 EST


Hi Alex,

On 01/05/2026 20:12, Alex Williamson wrote:

On Thu, 16 Apr 2026 06:17:44 -0700
Matt Evans <mattev@xxxxxxxx> wrote:

vfio_pci_dma_buf_cleanup() assumed all VFIO device DMABUFs need to be
revoked. However, if vfio_pci_dma_buf_move() revokes DMABUFs before
the fd/device closes, then vfio_pci_dma_buf_cleanup() would do a
second/underflowing kref_put() then wait_for_completion() on a
completion that never fires. Fixed by predicating on revocation
status.

This could happen if PCI_COMMAND_MEMORY is cleared before closing the
device fd (but the scenario is more likely to hit when future commits
add more methods to revoke DMABUFs).

Fixes: 1a8a5227f2299 ("vfio: Wait for dma-buf invalidation to complete")
Signed-off-by: Matt Evans <mattev@xxxxxxxx>
---

(Just a fix, but later "vfio/pci: Convert BAR mmap() to use a DMABUF"
and "vfio/pci: Permanently revoke a DMABUF on request" depend on this
context, so including in this series.)

We really need a fix for this split out from this series. It's already
been shown[1] that this is trivially reachable. Carlos proposed[2] a
similar solution to the one below. I was concurrently working on the
issue and suggested an alternative[3]. Let's pick a solution for
7.1-rc. Thanks,

It looks like [3] is progressing, so I'll drop this one when I can rebase onto it.

I noticed [3] removes the dma_resv_lock(priv->dmabuf->resv) around the priv->vdev = NULL assignment. This series' vfio_pci_mmap_huge_fault() relies on vdev changing only whilst resv is held, in order to resolve a race between a fault and cleanup (see patch 7 of this series). The handler takes resv so that it can stably test vdev in order to take memory_lock.

Must your fix change vdev outside of holding resv? I'm still sketching alternatives; at first glance perhaps the fault handler could rely on vdev being valid if !revoked, which can be tested holding [only] resv.


Thanks,

Matt


Alex

[1] https://lore.kernel.org/all/GVXPR02MB12019AA6014F27EF5D773E89BFB372@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/all/20260429182736.409323-2-clopez@xxxxxxx/
[3] https://lore.kernel.org/all/20260429142242.70f746b4@xxxxxxxxxx/

drivers/vfio/pci/vfio_pci_dmabuf.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index 281ba7d69567..04478b7415a0 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -395,20 +395,25 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
 	down_write(&vdev->memory_lock);
 	list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) {
+		bool was_revoked;
+
 		if (!get_file_active(&priv->dmabuf->file))
 			continue;
 		dma_resv_lock(priv->dmabuf->resv, NULL);
 		list_del_init(&priv->dmabufs_elm);
 		priv->vdev = NULL;
+		was_revoked = priv->revoked;
 		priv->revoked = true;
 		dma_buf_invalidate_mappings(priv->dmabuf);
 		dma_resv_wait_timeout(priv->dmabuf->resv,
 				      DMA_RESV_USAGE_BOOKKEEP, false,
 				      MAX_SCHEDULE_TIMEOUT);
 		dma_resv_unlock(priv->dmabuf->resv);
-		kref_put(&priv->kref, vfio_pci_dma_buf_done);
-		wait_for_completion(&priv->comp);
+		if (!was_revoked) {
+			kref_put(&priv->kref, vfio_pci_dma_buf_done);
+			wait_for_completion(&priv->comp);
+		}
 		vfio_device_put_registration(&vdev->vdev);
 		fput(priv->dmabuf->file);
 	}
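
For readers following the refcount argument, here is a minimal userspace sketch of the double-put that the hunk above guards against. It is not kernel code: struct toy_dmabuf and every toy_* function are hypothetical stand-ins, with refs modelling priv->kref, done modelling priv->comp having fired, and toy_revoke() modelling the revoke that the commit message attributes to vfio_pci_dma_buf_move(). It assumes that revoke path drops the reference exactly once, as the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the private DMABUF state, not the kernel struct. */
struct toy_dmabuf {
	int refs;	/* models priv->kref */
	bool revoked;	/* models priv->revoked */
	bool done;	/* models priv->comp having been completed */
};

/* Models kref_put(): the release callback fires only on the 1 -> 0 edge. */
static void toy_put(struct toy_dmabuf *p)
{
	if (--p->refs == 0)
		p->done = true;
}

/* Models a revoke before the fd closes (assumed to put the reference once). */
static void toy_revoke(struct toy_dmabuf *p)
{
	if (!p->revoked) {
		p->revoked = true;
		toy_put(p);
	}
}

/* Models cleanup before the patch: always puts, even if already revoked. */
static void toy_cleanup_unpatched(struct toy_dmabuf *p)
{
	p->revoked = true;
	toy_put(p);
}

/* Models cleanup after the patch: puts only if this call did the revoke. */
static void toy_cleanup_patched(struct toy_dmabuf *p)
{
	bool was_revoked = p->revoked;

	p->revoked = true;
	if (!was_revoked)
		toy_put(p);
}

/* Runs one scenario and returns the final reference count. */
static int toy_refs_after(bool patched, bool revoke_first)
{
	struct toy_dmabuf p = { .refs = 1, .revoked = false, .done = false };

	if (revoke_first) {
		toy_revoke(&p);
		assert(p.done);	/* the revoke dropped the last reference */
	}
	if (patched)
		toy_cleanup_patched(&p);
	else
		toy_cleanup_unpatched(&p);
	return p.refs;
}
```

In this model, toy_refs_after(false, true) ends with a count of -1, the underflow; in the kernel the corresponding wait_for_completion() would then block on a completion that has already been consumed. The patched variant leaves the count at 0 whether or not a revoke happened first.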