Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct

From: Robin Murphy
Date: Thu May 09 2024 - 09:28:46 EST


On 04/05/2024 9:53 am, Petr Tesařík wrote:
> On Fri, 3 May 2024 18:37:12 +0000
> "T.J. Mercier" <tjmercier@xxxxxxxxxx> wrote:
>
>> As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
>> not cacheline-aligned") sg_dma_mark_swiotlb is called when
>> dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
>> of a scatterlist. It is never set for the direct path, so drivers
>> cannot always rely on sg_dma_is_swiotlb to return correctly after
>> calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
>> direct path like it is in the IOMMU path.
>>
>> Signed-off-by: T.J. Mercier <tjmercier@xxxxxxxxxx>
>> ---
>> kernel/dma/direct.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
>> index 4d543b1e9d57..52f0dcb25ca2 100644
>> --- a/kernel/dma/direct.c
>> +++ b/kernel/dma/direct.c
>> @@ -12,7 +12,7 @@
>> #include <linux/pfn.h>
>> #include <linux/vmalloc.h>
>> #include <linux/set_memory.h>
>> -#include <linux/slab.h>
>> +#include <linux/swiotlb.h>
>> #include "direct.h"
>> /*
>> @@ -497,6 +497,8 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>> goto out_unmap;
>> }
>> sg_dma_len(sg) = sg->length;
>> + if (is_swiotlb_buffer(dev, dma_to_phys(dev, sg->dma_address)))
>> + sg_dma_mark_swiotlb(sg);
>> }
>> return nents;
>
> I'm not sure this does the right thing. IIUC when the scatterlist flags
> include SG_DMA_SWIOTLB, iommu_dma_sync_sg_for_*() will call
> iommu_dma_sync_single_for_*(), which in turn translates the DMA address
> to a physical address using iommu_iova_to_phys(). It seems to me that
> this function may not work correctly if there is no IOMMU, but it also
> seems to me that the scatterlist may contain such non-IOMMU addresses.

In principle dma-direct *could* make use of the SG_DMA_SWIOTLB flag for an ever-so-slightly cheaper check than is_swiotlb_buffer() in sync_sg and unmap_sg, the same way as iommu-dma does. However, the benefit would be a lot less significant than for iommu-dma, where it's really about the overhead of needing to perform iommu_iova_to_phys() translations for every segment, every time, just to *get* the address to check with is_swiotlb_buffer() - that's what would be unreasonably prohibitive otherwise.
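
To illustrate the cost difference, here is a rough userspace toy model, not kernel code. Every mock_* name, the fixed IOVA offset and the per-segment flag layout are assumptions made up for the sketch; the real translation is a page-table walk, which the counter merely stands in for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy userspace model; every name below is a hypothetical stand-in. */
#define MOCK_SG_DMA_SWIOTLB	(1u << 0)
#define MOCK_IOVA_OFFSET	0x100000000ull

struct mock_pool {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

struct mock_sg {
	uint64_t dma_address;
	unsigned int dma_flags;
};

/* Counts how often the (expensive, in real life) translation runs. */
static unsigned long translations;

static bool mock_is_swiotlb_buffer(const struct mock_pool *pool, uint64_t phys)
{
	return phys >= pool->start && phys < pool->end;
}

static uint64_t mock_iova_to_phys(uint64_t iova)
{
	translations++;
	return iova - MOCK_IOVA_OFFSET;
}

/* At map time the physical address is known, so marking is cheap. */
static void mock_map_sg(const struct mock_pool *pool, struct mock_sg *sg,
			const uint64_t *phys, size_t nents)
{
	assert(pool && sg && phys);
	for (size_t i = 0; i < nents; i++) {
		sg[i].dma_address = phys[i] + MOCK_IOVA_OFFSET;
		sg[i].dma_flags = mock_is_swiotlb_buffer(pool, phys[i]) ?
				  MOCK_SG_DMA_SWIOTLB : 0;
	}
}

/* Without a flag: one translation per segment on every sync/unmap. */
static size_t count_bounced_by_translation(const struct mock_pool *pool,
					    const struct mock_sg *sg,
					    size_t nents)
{
	size_t n = 0;

	for (size_t i = 0; i < nents; i++)
		if (mock_is_swiotlb_buffer(pool,
					   mock_iova_to_phys(sg[i].dma_address)))
			n++;
	return n;
}

/* With the flag: a bit test, no translation at all. */
static size_t count_bounced_by_flag(const struct mock_sg *sg, size_t nents)
{
	size_t n = 0;

	for (size_t i = 0; i < nents; i++)
		if (sg[i].dma_flags & MOCK_SG_DMA_SWIOTLB)
			n++;
	return n;
}
```

In this model both counting functions agree on which segments were bounced, but only the translation-based one bumps the counter, once per segment per call. In the dma-direct case the equivalent of mock_iova_to_phys() is simple arithmetic, which is why the saving there would be marginal.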

Thanks,
Robin