Re: [Linaro-mm-sig] [PATCH 4/8] dma-buf: add peer2peer flag

From: Alex Deucher
Date: Wed Apr 25 2018 - 02:11:07 EST


On Wed, Apr 25, 2018 at 1:48 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> On Tue, Apr 24, 2018 at 09:32:20PM +0200, Daniel Vetter wrote:
>> Out of curiosity, how much virtual flushing stuff is there still out
>> there? At least in drm we've pretty much ignore this, and seem to be
>> getting away without a huge uproar (at least from driver developers
>> and users, core folks are less amused about that).
>
> As I've just been wading through the code, the following architectures
> have non-coherent dma that flushes by virtual address for at least some
> platforms:
>
> - arm [1], arm64, hexagon, nds32, nios2, parisc, sh, xtensa, mips,
> powerpc
>
> These have non-coherent dma ops that flush by physical address:
>
> - arc, arm [1], c6x, m68k, microblaze, openrisc, sparc
>
> And these do not have non-coherent dma ops at all:
>
> - alpha, h8300, riscv, unicore32, x86
>
> [1] arm Ñeems to do both virtually and physically based ops, further
> audit needed.
>
> Note that using virtual addresses in the cache flushing interface
> doesn't mean that the cache actually is virtually indexed, but it at
> least allows for the possibility.
>
>> > I think the most important thing about such a buffer object is that
>> > it can distinguish the underlying mapping types. While
>> > dma_alloc_coherent, dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT,
>> > dma_map_page/dma_map_single/dma_map_sg and dma_map_resource all give
>> > back a dma_addr_t they are in now way interchangable. And trying to
>> > stuff them all into a structure like struct scatterlist that has
>> > no indication what kind of mapping you are dealing with is just
>> > asking for trouble.
>>
>> Well the idea was to have 1 interface to allow all drivers to share
>> buffers with anything else, no matter how exactly they're allocated.
>
> Isn't that interface supposed to be dmabuf? Currently dma_map leaks
> a scatterlist through the sg_table in dma_buf_map_attachment /
> ->map_dma_buf, but looking at a few of the callers it seems like they
> really do not even want a scatterlist to start with, but check that
> is contains a physically contiguous range first. So kicking the
> scatterlist our there will probably improve the interface in general.
>
>> dma-buf has all the functions for flushing, so you can have coherent
>> mappings, non-coherent mappings and pretty much anything else. Or well
>> could, because in practice people hack up layering violations until it
>> works for the 2-3 drivers they care about. On top of that there's the
>> small issue that x86 insists that dma is coherent (and that's true for
>> most devices, including v4l drivers you might want to share stuff
>> with), and gpus really, really really do want to make almost
>> everything incoherent.
>
> How do discrete GPUs manage to be incoherent when attached over PCIe?

They can do CPU cache snooped (coherent) or non-snooped (incoherent)
DMA. Also for things like APUs, they show up as a PCIe device, but
that actual GPU core is part of the same die as the CPU and they have
their own special paths to memory, etc. The fact that they show up as
PCIe devices is mostly for enumeration purposes.

Alex

> _______________________________________________
> Linaro-mm-sig mailing list
> Linaro-mm-sig@xxxxxxxxxxxxxxxx
> https://lists.linaro.org/mailman/listinfo/linaro-mm-sig