[PATCH AUTOSEL 6.19] block: fix partial IOVA mapping cleanup in blk_rq_dma_map_iova
From: Sasha Levin
Date: Wed Feb 18 2026 - 21:10:24 EST
From: Chaitanya Kulkarni <kch@xxxxxxxxxx>
[ Upstream commit 81e7223b1a2d63b655ee72577c8579f968d037e3 ]
When dma_iova_link() fails partway through mapping a request's bvec
list, the function breaks out of the loop without cleaning up
already mapped segments. Similarly, if dma_iova_sync() fails after
linking all segments, no cleanup is performed.
This leaves partial IOVA mappings in place. The completion path
attempts to unmap the full expected size via dma_iova_destroy() or
nvme_unmap_data(), but only a partial size was actually mapped,
leading to incorrect unmap operations.
Add an out_unlink error path that calls dma_iova_destroy() to clean
up partial mappings before returning failure. The dma_iova_destroy()
function handles both partial unlink and IOVA space freeing. It
correctly handles the mapped_len == 0 case (first dma_iova_link()
failure) by only freeing the IOVA allocation without attempting to
unmap.
Signed-off-by: Chaitanya Kulkarni <kch@xxxxxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
LLM Generated explanations, may be completely bogus:
The function is called from `blk_dma_map_iter_start()` which is the main
DMA mapping entry point for block requests using IOMMU-based (IOVA)
mapping. This is used by NVMe drivers and potentially other high-
performance storage drivers.
## 3. Summary of Analysis
### What the bug is:
In `blk_rq_dma_map_iova()`, when `dma_iova_link()` fails partway through
mapping multiple segments, or when `dma_iova_sync()` fails after all
segments are linked:
1. **dma_iova_link() failure**: The code breaks out of the loop but
doesn't clean up already-linked segments, leaking the IOVA allocation
and the partial mappings. Worse, after the `break` the code falls
through to `dma_iova_sync()`, which then syncs a partially linked
range; this is also incorrect behavior.
2. **dma_iova_sync() failure**: The code returns `false` with
`iter->status` set, but doesn't call `dma_iova_destroy()` to clean up
the linked IOVA mappings.
In both cases, the completion path will attempt to unmap using the full
expected size via `dma_iova_destroy()` or `nvme_unmap_data()`, but only
a partial size was actually mapped, leading to **incorrect unmap
operations** — which could corrupt IOMMU mappings, cause IOMMU faults,
or lead to data corruption.
### Why it matters:
- This is a bug in the **block I/O DMA path** — the very core of how
storage I/O works with IOMMU
- It can trigger on any system using IOMMU with NVMe storage when memory
pressure or IOMMU resource exhaustion causes `dma_iova_link()` to fail
- Consequences of incorrect IOMMU unmap: potential data corruption,
IOMMU page faults, kernel crashes
- NVMe is extremely widely deployed; any system with IOMMU enabled could
be affected
### Stable criteria assessment:
- **Obviously correct**: Yes — adds proper error cleanup with
`dma_iova_destroy()` which is designed for exactly this purpose
- **Fixes a real bug**: Yes — partial IOVA mapping cleanup is missing,
leading to incorrect unmap operations
- **Small and contained**: Yes — the diff is minimal (8 insertions and
5 deletions in a single function in a single file)
- **No new features**: Correct — purely error path fix
- **Reviewed**: Yes — reviewed by Christoph Hellwig (original author of
the code), committed by Jens Axboe (block layer maintainer)
- **Tested in mainline**: Yes — it's in mainline already
### Risk assessment:
- **Very low risk**: The change only affects error paths, adding proper
cleanup where none existed
- **Well-understood cleanup function**: `dma_iova_destroy()` is
specifically designed for this cleanup and handles both partial and
zero-length cases
- **Backport note**: The patch won't apply cleanly to 6.17.y and 6.18.y
because the `attrs` variable was added in v6.19. A minor adaptation
would be needed (remove `attrs` and pass `0` to `dma_iova_destroy()`).
For 6.19.y, it should apply cleanly.
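An untested sketch of that adaptation for 6.17.y/6.18.y (context taken from the 6.19 hunk below; the surrounding code in older trees may differ slightly):

```diff
 out_unlink:
-	dma_iova_destroy(dma_dev, state, mapped, dir, attrs);
+	dma_iova_destroy(dma_dev, state, mapped, dir, 0);
```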
## Verification
- **git log** showed `blk-mq-dma.c` was created by commit
`b0a4158554b90` and the buggy function `blk_rq_dma_map_iova` was
introduced by `858299dc61603` ("block: add scatterlist-less DMA
mapping helpers"), first in v6.17
- **git tag --contains** confirmed the buggy code entered mainline in
v6.17
- **git show stable/linux-6.18.y:block/blk-mq-dma.c** and **git show
stable/linux-6.17.y:block/blk-mq-dma.c** both confirmed the buggy code
is present in stable trees (with `break` instead of `goto out_unlink`,
and no cleanup on `dma_iova_sync` failure)
- **git show stable/linux-6.12.y:block/blk-mq-dma.c** confirmed the file
does NOT exist in 6.12.y (only affects 6.17+)
- **Read of drivers/iommu/dma-iommu.c** confirmed `dma_iova_destroy()`
properly handles mapped_len == 0 by calling `dma_iova_free()` only,
and mapped_len > 0 by unlinking and freeing
- **git log stable/linux-6.18.y** and **stable/linux-6.19.y** confirmed
the fix has not yet been applied to stable trees
- **git log 37f0c7a8df7ad** confirmed the `attrs` variable was
introduced in v6.19, meaning 6.17.y and 6.18.y will need a trivial
adaptation for clean backport
- The reviewer (Christoph Hellwig) is verified as the original author of
the buggy code via the Signed-off-by on `858299dc61603`
**YES**
block/blk-mq-dma.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index fb018fffffdcc..feead1934301a 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -126,17 +126,20 @@ static bool blk_rq_dma_map_iova(struct request *req, struct device *dma_dev,
error = dma_iova_link(dma_dev, state, vec->paddr, mapped,
vec->len, dir, attrs);
if (error)
- break;
+ goto out_unlink;
mapped += vec->len;
} while (blk_map_iter_next(req, &iter->iter, vec));
error = dma_iova_sync(dma_dev, state, 0, mapped);
- if (error) {
- iter->status = errno_to_blk_status(error);
- return false;
- }
+ if (error)
+ goto out_unlink;
return true;
+
+out_unlink:
+ dma_iova_destroy(dma_dev, state, mapped, dir, attrs);
+ iter->status = errno_to_blk_status(error);
+ return false;
}
static inline void blk_rq_map_iter_init(struct request *rq,
--
2.51.0