Re: [PATCH] arm: dma: fix sharing of coherent DMA memory without struct page

From: Russell King - ARM Linux
Date: Wed Apr 05 2017 - 19:17:53 EST


On Wed, Apr 05, 2017 at 10:02:42AM -0600, Shuah Khan wrote:
> When coherent DMA memory without struct page is shared, importer
> fails to find the page and runs into kernel page fault when it
> tries to dmabuf_ops_attach/map_sg/map_page the invalid page found
> in the sg_table. Please see www.spinics.net/lists/stable/msg164204.html
> for more information on this problem.
>
> This solution allows coherent DMA memory without struct page to be
> shared by providing a way for the exporter to tag the DMA buffer as
> a special buffer without struct page association and passing the
> information in sg_table to the importer. This information is used
> in attach/map_sg to avoid cleaning D-cache and mapping.
>
> The details of the change are:
>
> Framework:
> - Add a new dma_attrs field to struct scatterlist.
> - Add a new DMA_ATTR_DEV_COHERENT_NOPAGE attribute to clearly identify
> Coherent memory without struct page.
> - Add a new dma_check_dev_coherent() interface to check if memory is
> the device coherent area. There is no way to tell where the memory
> returned by dma_alloc_attrs() came from.
>
> Exporter logic:
> - Add logic to vb2_dc_alloc() to call dma_check_dev_coherent() and set
> DMA_ATTR_DEV_COHERENT_NOPAGE based the results of the check. This is
> done in the exporter context.
> - Add logic to arm_dma_get_sgtable() to identify memory without struct
> page using DMA_ATTR_DEV_COHERENT_NOPAGE attribute. If this attr is
> set, arm_dma_get_sgtable() will set page as the cpu_addr and update
> dma_address and dma_attrs fields in struct scatterlist for this sgl.
> This is done in exporter context when buffer is exported. With this

This sentence appears to just end...

I'm not convinced that coherent allocations should be setting the "page"
of a scatterlist to anything that isn't a real struct page or NULL. It
is, after all, an error to look up the virtual address etc of the
scatterlist entry or kmap it when it isn't backed by a struct page.

I'm actually already passing non-page backed memory through the DMA API
in armada-drm, although not entirely correctly, and etnaviv handles it
fine:

} else if (dobj->linear) {
/* Single contiguous physical region - no struct page */
if (sg_alloc_table(sgt, 1, GFP_KERNEL))
goto free_sgt;
sg_dma_address(sgt->sgl) = dobj->dev_addr;
sg_dma_len(sgt->sgl) = dobj->obj.size;

This is not quite correct, as it assumes (which is safe for it currently)
that the DMA address is the same on all devices. On Dove, which is where
this is used, that is the case, but it's not true elsewhere. Also note
that I'm avoid calling dma_map_sg() and dma_unmap_sg() - there's no iommus
to be considered.

I'd suggest that this follows the same pattern - setting the DMA address
(more appropriately for generic code) and the DMA length, while leaving
the virtual address members NULL/0. However, there's also the
complication of setting up any IOMMUs that would be necessary. I haven't
looked at that, or how it could work.

I also think this should be documented in the dmabuf API that it can
pass such scatterlists that are DMA-parameter only.

Lastly, I'd recommend that anything using this does _not_ provide
functional kmap/kmap_atomic support for these - kmap and kmap_atomic
are both out because there's no struct page anyway (and their use would
probably oops the kernel in this scenario.) I avoided mmap support in
armada drm, but if there's a pressing reason and real use case for the
importer to mmap() the buffers in userspace, it's something I could be
convinced of.

What I'm quite certain of is that we do _not_ want to be passing
coherent memory allocations into the streaming DMA API, not even with
a special attribute. The DMA API is about gaining coherent memory
(shared ownership of the buffer), or mapping system memory to a
specified device (which can imply unique ownership.) Trying to mix
the two together muddies the separation that we have there, and makes
it harder to explain. As can be seen from this patch, we'd end up
needing to add this special DMA_ATTR_DEV_COHERENT_NOPAGE everywhere,
which is added complexity on top of stuff that is not required for
this circumstance.

I can see why you're doing it, to avoid having to duplicate more of
the generic code in drm_prime, but I don't think plasting over this
problem in arch code by adding this special flag is a particularly
good way forward.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.