Re: revert dma direct internals abuse

From: Thomas Hellstrom
Date: Tue Apr 09 2019 - 10:17:52 EST


On Tue, 2019-04-09 at 15:31 +0200, hch@xxxxxx wrote:
> On Tue, Apr 09, 2019 at 01:04:51PM +0000, Thomas Hellstrom wrote:
> > On the VMware platform we have two possible vIOMMUs, the AMD IOMMU
> > and Intel VT-d. Given those conditions I believe the patch is
> > functionally correct. We can't cover the AMD case with
> > intel_iommu_enabled. Furthermore, the only form of incoherency that
> > can affect our graphics device is someone forcing SWIOTLB, in which
> > case that person would be happier with software rendering. In any
> > case, observing the fact that the direct_ops are not used makes
> > sure that SWIOTLB is not used. Knowing that we're on the VMware
> > platform, we're coherent and can safely have the DMA layer do DMA
> > address translation for us. All this information was not explicitly
> > written in the changelog, no.
>
> We have a series pending that might bounce your buffers even when
> using the Intel IOMMU, which should eventually also find its way
> to other IOMMUs:
>
>
> https://lists.linuxfoundation.org/pipermail/iommu/2019-March/034090.html

If that's the case, I think most of the graphics drivers will stop
functioning. I don't think people would want that, and even if the
graphics drivers are "to blame" for not implementing the sync calls,
the work involved in getting this right is considerable, if it's
possible at all.
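
To illustrate the scope of that work: every CPU access to a streaming
mapping would need to be bracketed by sync calls, roughly like the
sketch below (the function name and parameters are made up for
illustration):

#include <linux/dma-mapping.h>

static int cpu_access_sketch(struct device *dev, struct page *page)
{
	dma_addr_t addr;

	addr = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, addr))
		return -ENOMEM;

	/* Before the CPU reads data the device may have written: */
	dma_sync_single_for_cpu(dev, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);

	/* ... CPU access to the page goes here ... */

	/* Before handing the buffer back to the device: */
	dma_sync_single_for_device(dev, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);

	dma_unmap_page(dev, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
	return 0;
}

Doing that consistently for every CPU access path in every graphics
driver is the part I don't see happening easily.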

>
> > In any case, assuming that patch is reverted due to the layering
> > violation, are you willing to help out with a small API to detect
> > the situation where streaming DMA is incoherent?
>
> The short but sad answer is that we can't ever guarantee that you
> can skip the dma_*sync_* calls. There are too many factors in play
> that might require it at any time - working around unaligned
> addresses in IOMMUs, CPUs that are coherent for some devices and not
> others, addressing limitations both in physical CPUs and VMs (see
> the various "secure VM" concepts floating around at the moment).
>
> If you want to avoid the dma_*sync_* calls you must use
> dma_alloc_coherent to allocate the memory. Note that the memory for
> dma_alloc_coherent actually comes from the normal page pool most of
> the time, and for certain on x86, which seems to be what you care
> about. The times of it dipping into the tiny swiotlb pool are long
> gone. So at least for you I see absolutely no reason to not simply
> always use dma_alloc_coherent to start with. For other uses that
> involve platforms without DMA-coherent devices, like arm, the
> tradeoffs might be a little different.
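
If I understand you correctly, the pattern you're suggesting is
roughly the following (just a sketch; the function name and error
handling are made up for illustration):

#include <linux/dma-mapping.h>

static int coherent_alloc_sketch(struct device *dev, size_t size)
{
	dma_addr_t dma_addr;
	void *cpu_addr;

	cpu_addr = dma_alloc_coherent(dev, size, &dma_addr, GFP_KERNEL);
	if (!cpu_addr)
		return -ENOMEM;

	/* CPU and device can both access the buffer without any
	 * dma_*sync_* calls. */

	dma_free_coherent(dev, size, cpu_addr, dma_addr);
	return 0;
}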

There are two things that concern me with dma_alloc_coherent:

1) It seems to want pages mapped either in the kernel map or vmapped.
Graphics drivers allocate huge amounts of memory, typically up to 50%
of system memory or more. On a 32-bit PAE system I'm afraid of running
out of vmap space as well as not being able to allocate as much memory
as I want. Perhaps a dma_alloc_coherent() interface that returns a
page rather than a virtual address would do the trick.

2) Exporting using dma-buf. A page allocated using dma_alloc_coherent()
for one device might not be coherent for another device. What happens
if I allocate a page using dma_alloc_coherent() for device 1 and then
want to map it using dma_map_page() for device 2?
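
In code, the situation I'm asking about would look roughly like this
(a sketch only; whether virt_to_page() is even valid on a
dma_alloc_coherent() allocation is part of the question):

#include <linux/dma-mapping.h>
#include <linux/mm.h>

static int crossdev_sketch(struct device *dev1, struct device *dev2)
{
	dma_addr_t dma1, dma2;
	void *cpu_addr;
	struct page *page;

	/* Coherent allocation for device 1. */
	cpu_addr = dma_alloc_coherent(dev1, PAGE_SIZE, &dma1, GFP_KERNEL);
	if (!cpu_addr)
		return -ENOMEM;

	/* Streaming mapping of the same backing page for device 2;
	 * assumes the allocation is backed by a normal struct page. */
	page = virt_to_page(cpu_addr);
	dma2 = dma_map_page(dev2, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev2, dma2)) {
		dma_free_coherent(dev1, PAGE_SIZE, cpu_addr, dma1);
		return -ENOMEM;
	}

	/* Does device 2 now need dma_*sync_* calls around CPU access,
	 * even though device 1 does not? */

	dma_unmap_page(dev2, dma2, PAGE_SIZE, DMA_BIDIRECTIONAL);
	dma_free_coherent(dev1, PAGE_SIZE, cpu_addr, dma1);
	return 0;
}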

Thanks,
Thomas