Re: [PATCH v1 1/2] dma: return 0 from dma_opt_mapping_size() when no real hint exists

From: Robin Murphy

Date: Tue Mar 17 2026 - 05:50:38 EST


On 2026-03-16 8:39 pm, Ionut Nechita (Wind River) wrote:
From: Ionut Nechita <ionut.nechita@xxxxxxxxxxxxx>

dma_opt_mapping_size() currently initializes its local size to SIZE_MAX
and, when neither an IOMMU nor a DMA ops opt_mapping_size callback is
present, returns min(dma_max_mapping_size(dev), SIZE_MAX). That value
is a large but finite number that has nothing to do with an optimal
transfer size — it is simply the maximum the DMA layer can map.

No, the current code is correct. dma_opt_mapping_size() represents the largest size that can be mapped without incurring any significant performance penalty (compared to smaller sizes). If the implementation has no such restriction, then the largest "efficient" size is quite obviously just the largest size in total.
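
To make the semantics described above concrete, here is a minimal userspace sketch. The struct and function names are hypothetical stand-ins, not kernel code, and a zero opt_mapping_size field is only this model's way of saying "no callback present".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a DMA backend; not a kernel structure. */
struct backend_model {
	size_t max_mapping_size; /* largest size the backend can map at all */
	size_t opt_mapping_size; /* largest efficient size; 0 models "no callback" */
};

/* Same shape as the current helper: start from SIZE_MAX, clamp to the maximum. */
static size_t model_opt_mapping_size(const struct backend_model *b)
{
	size_t size = SIZE_MAX;

	assert(b != NULL);
	if (b->opt_mapping_size) /* models "backend supplies a hint" */
		size = b->opt_mapping_size;

	return size < b->max_mapping_size ? size : b->max_mapping_size;
}
```

In this model a backend with no efficiency restriction simply reports its maximum, which is the behaviour being defended here.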

Callers such as scsi_transport_sas treat the return value as a genuine
optimization hint and propagate it into Scsi_Host.opt_sectors, which in
turn becomes the block device's optimal_io_size. On SAS controllers
like mpt3sas running with IOMMU in passthrough mode the bogus value
(max_sectors << 9 = 16776704, rounded to 16773120) reaches mkfs.xfs,
which computes swidth=4095 and sunit=2. Because 4095 is not a multiple
of 2, XFS rejects the geometry with "SB stripe unit sanity check
failed", making it impossible to create filesystems during system
bootstrap.
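
The arithmetic in the report can be reproduced with a short sketch. It assumes max_sectors = 32767 (derived from 16776704 >> 9) and a 4096-byte filesystem block (implied by 16773120 / 4095); the helper names are illustrative, and the exact place where the block layer does the rounding is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a count of 512-byte sectors to bytes. */
static size_t sectors_to_bytes(size_t sectors)
{
	return sectors << 9;
}

/* Round value down to a multiple of align; align must be non-zero. */
static size_t round_down_to(size_t value, size_t align)
{
	assert(align != 0);
	return value - (value % align);
}

/* Stripe width in filesystem blocks for a given optimal I/O size in bytes. */
static size_t swidth_blocks(size_t opt_io_bytes, size_t block_size)
{
	assert(block_size != 0);
	return opt_io_bytes / block_size;
}
```

With these helpers, sectors_to_bytes(32767) is 16776704, rounding that down to 4096 gives 16773120, and swidth_blocks(16773120, 4096) is 4095, which is odd and therefore not a multiple of sunit=2.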

And that is obviously a bug. There has never been any guarantee offered about the values returned by either dma_max_mapping_size() or dma_opt_mapping_size() - they could be very large, very small, and certainly do not have to be powers of 2. Say an implementation has some internal data size optimisation that makes U32_MAX its largest "efficient" size, it's free to return that, and then you'll still have the same bug regardless of this bodge.
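
As an illustration of that point, the sketch below shows one possible caller-side approach: round the hint down to a multiple of the minimum I/O size before using it. This is an assumption for illustration only; sanitize_opt_io() is a hypothetical helper, not an existing kernel function and not a fix proposed in this thread. The 8192-byte minimum I/O size used in the note afterwards assumes sunit=2 with 4096-byte blocks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical caller-side helper (not an existing kernel function):
 * round an optimal-size hint down to a multiple of the minimum I/O size.
 * The result is 0 ("no usable hint") when the hint is smaller than the
 * minimum I/O size.
 */
static uint64_t sanitize_opt_io(uint64_t opt_bytes, uint64_t min_io_bytes)
{
	assert(min_io_bytes != 0);
	return opt_bytes - (opt_bytes % min_io_bytes);
}
```

A U32_MAX hint has the same problem as the reported value: 4294967295 / 4096 is 1048575 blocks, which is also odd. Rounding in the caller handles both cases, for example sanitize_opt_io(16773120, 8192) gives 16769024.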

Fix the actual bug rather than breaking common code in an attempt to paper over it, especially when that attempt doesn't even achieve its goal very well.

Thanks,
Robin.

Fix this by returning 0 when no backend provides an optimal mapping size
hint. A return value of 0 unambiguously means "no preference" and lets
callers that use min() or min_not_zero() do the right thing without
special-casing.
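
For reference, a userspace model of how a 0 return interacts with the two combining styles mentioned above. The model_ functions are illustrative stand-ins for the kernel's min() and min_not_zero() macros, not the macros themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Plain minimum: a 0 operand always wins. */
static size_t model_min(size_t a, size_t b)
{
	return a < b ? a : b;
}

/* Minimum that ignores zero operands; returns 0 only if both are 0. */
static size_t model_min_not_zero(size_t a, size_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/* Self-check of the two behaviours with a 0 ("no hint") operand. */
static void model_self_check(void)
{
	assert(model_min(0, 131072) == 0);
	assert(model_min_not_zero(0, 131072) == 131072);
}
```

In this model only the min_not_zero() style passes the other operand through unchanged; a plain min() against 0 yields 0.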

The only other in-tree caller (nvme-pci) is adjusted in the next patch.

Fixes: a229cc14f339 ("dma-mapping: add dma_opt_mapping_size()")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Ionut Nechita <ionut.nechita@xxxxxxxxxxxxx>
---
kernel/dma/mapping.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 78d8b4039c3e6..fffa6a3f191a3 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -984,14 +984,17 @@ EXPORT_SYMBOL_GPL(dma_max_mapping_size);
 size_t dma_opt_mapping_size(struct device *dev)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
-	size_t size = SIZE_MAX;
 
 	if (use_dma_iommu(dev))
-		size = iommu_dma_opt_mapping_size();
-	else if (ops && ops->opt_mapping_size)
-		size = ops->opt_mapping_size();
+		return iommu_dma_opt_mapping_size();
+	if (ops && ops->opt_mapping_size)
+		return ops->opt_mapping_size();
 
-	return min(dma_max_mapping_size(dev), size);
+	/*
+	 * No backend provided an optimal size hint. Return 0 so that
+	 * callers can distinguish "no hint" from a real value.
+	 */
+	return 0;
 }
 EXPORT_SYMBOL_GPL(dma_opt_mapping_size);