Re: dma coherent memory user-space maps
From: Christoph Hellwig
Date: Thu Oct 31 2019 - 17:54:20 EST
Hi Thomas,
sorry for the delay. I've been travelling way too much lately and had
a hard time keeping up.
On Tue, Oct 08, 2019 at 02:34:17PM +0200, Thomas Hellström (VMware) wrote:
> /* Obtain struct dma_pfn pointers from a dma coherent allocation */
> int dma_get_dpfns(struct device *dev, void *cpu_addr, dma_addr_t dma_addr,
> pgoff_t offset, pgoff_t num, dma_pfn_t dpfns[]);
>
> I figure, for most if not all architectures we could use an ordinary pfn as
> dma_pfn_t, but the dma layer would still have control over how those pfns
> are obtained and how they are used in the kernel's mapping APIs.
>
> If so, I could start looking at this, time permitting, for the cases where
> the pfn can be obtained from the kernel address or from
> arch_dma_coherent_to_pfn(), and also the needed work to have a tailored
> vmap_pfn().
I'm not sure that infrastructure is all that helpful unfortunately, even
if it ended up working. The problem with the 'coherent' DMA mappings
is that they have a few different backends. For architectures that
are DMA coherent everything is easy and we use the normal page
allocator, and your proposal above is trivially doable as wrappers around the
existing functionality. Others remap ptes to be uncached, either
in-place or using vmap, and the remaining ones use weird special
allocators for which almost everything we can normally do in the VM
will fail.
I promised Christian an uncached DMA allocator a while ago, and still
haven't finished that either unfortunately. But based on looking at
the x86 pageattr code I'm now firmly down the road of using the
set_memory_* helpers that change the pte attributes in place, as
everything else can't actually work on x86 which doesn't allow
aliasing of PTEs with different caching attributes. The arm64 folks
also would prefer in-place remapping even if they don't support it
yet, and that is something the i915 code already does in a somewhat
hacky way, and something the msm drm driver wants. So I decided to
come up with an API that gives back 'coherent' pages on the
architectures that support it and otherwise just fails.
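To make the "give back pages or fail" contract concrete, here is a small
user-space model in C. It is only a sketch and not kernel code: every
name in it (coherent_backend, model_pages, model_alloc_coherent_pages,
model_free_coherent_pages) is hypothetical, calloc() stands in for the
page allocator, and the choice to fail for the vmap and special-allocator
backends is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical classification of the 'coherent' backends discussed above. */
enum coherent_backend {
	BACKEND_COHERENT,	/* DMA-coherent arch: normal page allocator */
	BACKEND_REMAP_INPLACE,	/* pte attributes changed to uncached in place */
	BACKEND_REMAP_VMAP,	/* uncached alias created through vmap */
	BACKEND_SPECIAL_POOL,	/* special allocator, no ordinary pages */
};

/* Hypothetical handle for the pages handed back to the caller. */
struct model_pages {
	void *mem;
	size_t npages;
	int uncached;	/* 1 if the model treats the pages as uncached */
};

/*
 * Return pages only for backends where ordinary pages exist and can be
 * used directly; fail for everything else instead of papering over it.
 */
int model_alloc_coherent_pages(enum coherent_backend be, size_t npages,
			       struct model_pages *out)
{
	if (npages == 0 || out == NULL)
		return -EINVAL;

	switch (be) {
	case BACKEND_COHERENT:
	case BACKEND_REMAP_INPLACE:
		break;
	default:
		/* Assumption: aliasing and special pools are not supported. */
		return -EOPNOTSUPP;
	}

	out->mem = calloc(npages, MODEL_PAGE_SIZE);
	if (out->mem == NULL)
		return -ENOMEM;
	out->npages = npages;
	out->uncached = (be == BACKEND_REMAP_INPLACE);
	return 0;
}

void model_free_coherent_pages(struct model_pages *p)
{
	free(p->mem);
	p->mem = NULL;
	p->npages = 0;
}
```

A caller would check the return value and fall back to its own path on
-EOPNOTSUPP rather than assume pages are always available.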
Do you care about architectures other than x86 and arm64? If not I'll
hopefully have something for you soon.