Re: [PATCH v1] udmabuf: Ensure to perform cache synchronisation in begin_cpu_udmabuf()

From: Mikhail Gavrilov

Date: Sat Jun 27 2026 - 07:11:43 EST


On Sat, Jun 27, 2026 at 3:58 PM Robert Mader <robert.mader@xxxxxxxxxxxxx> wrote:
>
> The message of commit 504e2b4ab97a ("dma-buf/udmabuf: skip redundant cpu sync to
> fix cacheline EEXIST warning") says:
>
> > The CPU sync at map/unmap time is also redundant for udmabuf:
> > begin_cpu_udmabuf() and end_cpu_udmabuf() already perform explicit
> > cache synchronization via dma_sync_sgtable_for_cpu/device() when CPU
> > access is requested through the dma-buf interface.
>
> This, however, does not apply to the first time begin_cpu_udmabuf() is
> called on an udmabuf, in which case the implementation previously relied on
> get_sg_table() to perform the cache synchronisation.
>
> Ensure to call dma_sync_sgtable_for_cpu() in that case as well.
>
> Fixes: 504e2b4ab97a ("dma-buf/udmabuf: skip redundant cpu sync to fix cacheline EEXIST warning")
> Signed-off-by: Robert Mader <robert.mader@xxxxxxxxxxxxx>
> ---
> drivers/dma-buf/udmabuf.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> index bced421c0d65..08f57bc1294d 100644
> --- a/drivers/dma-buf/udmabuf.c
> +++ b/drivers/dma-buf/udmabuf.c
> @@ -224,21 +224,22 @@ static int begin_cpu_udmabuf(struct dma_buf *buf,
> {
> struct udmabuf *ubuf = buf->priv;
> struct device *dev = ubuf->device->this_device;
> - int ret = 0;
>
> if (!ubuf->sg) {
> ubuf->sg = get_sg_table(dev, buf, direction);
> if (IS_ERR(ubuf->sg)) {
> + int ret;
> +
> ret = PTR_ERR(ubuf->sg);
> ubuf->sg = NULL;
> + return ret;
> } else {
> ubuf->sg_dir = direction;
> }
> - } else {
> - dma_sync_sgtable_for_cpu(dev, ubuf->sg, direction);
> }
>
> - return ret;
> + dma_sync_sgtable_for_cpu(dev, ubuf->sg, direction);
> + return 0;
> }
>
> static int end_cpu_udmabuf(struct dma_buf *buf,
> --
> 2.54.0
>

Hi Robert,

Thanks for catching this. You're right -- with DMA_ATTR_SKIP_CPU_SYNC
in get_sg_table(), the first begin_cpu_udmabuf() call no longer gets
the implicit CPU sync that the dma_map path used to provide, and the
explicit sync in the old else-branch only ran when ubuf->sg already
existed.

The fix correctly moves dma_sync_sgtable_for_cpu() so it runs in both
cases. The logic looks right to me.

This wouldn't reproduce on x86 since DMA is cache-coherent there
(dma_sync_* is a no-op), which is why I missed it -- it only matters
on non-coherent architectures. My apologies for the regression.

Reviewed-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx>

--
Thanks,
Mikhail