Re: [RFC] dma-mapping: fix dma_common_mmap() for ARC

From: Catalin Marinas
Date: Sun Oct 30 2016 - 17:27:05 EST


On Wed, Oct 26, 2016 at 10:22:44PM +0300, Alexey Brodkin wrote:
> ------------------------>8-----------------------
> arc_dma_alloc()
>     ioremap_nocache() AKA ioremap()
>         ioremap_prot()
>             get_vm_area() + ioremap_page_range() on obtained vaddr
> ------------------------>8-----------------------
>
> As a result we get TLB entry of the following kind:
> ------------------------>8-----------------------
> vaddr = 0x7200_0000
> paddr = 0x8200_0000
> flags = _uncached_
> ------------------------>8-----------------------
>
> Kernel thinks frame buffer is located @ 0x7200_0000 and uses it
> perfectly fine.
>
> But here comes a time for user-space application to request frame buffer
> to be mapped for it. That happens easily with the following call path:
> ------------------------>8-----------------------
> fb_mmap()
>     drm_fb_cma_mmap()
>         dma_mmap_writecombine() AKA dma_mmap_wc()
>             dma_mmap_attrs()
>                 dma_common_mmap() since we don't [yet] have
>                 dma_map_ops.mmap() for ARC
> ------------------------>8-----------------------
>
> And in dma_common_mmap() we first calculate pfn of what we think is
> "physical page" and then do remap_pfn_range() to that "physical page".
>
> Here we're getting to the interesting thing - how pfn is calculated.
> As of now this is done as simple as:
> ------------------------>8-----------------------
> pfn = page_to_pfn(virt_to_page(cpu_addr));
> ------------------------>8-----------------------

The virt_to_page() function here only works for addresses in the kernel
linear map. In your case, the DMA buffer is mapped in the ioremap space,
so passing that cpu_addr to virt_to_page() yields an incorrect pfn (as
you've already noticed).
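
To make the mismatch concrete, here is a rough userspace model (not
kernel code) using the addresses from your TLB dump. PAGE_SHIFT,
PAGE_OFFSET and PHYS_OFFSET are illustrative assumptions, as are the
helper names; the point is only that a linear-map conversion applied to
an ioremap address gives a pfn unrelated to the real one.

```c
#include <stdint.h>

/* Illustrative assumptions, not values taken from the ARC headers. */
#define PAGE_SHIFT   13u          /* assumed 8 KiB pages */
#define PAGE_OFFSET  0x80000000u  /* assumed start of the kernel linear map */
#define PHYS_OFFSET  0x80000000u  /* assumed physical address it maps to */

/*
 * Model of what page_to_pfn(virt_to_page(x)) boils down to: a fixed
 * offset conversion that is only meaningful for linear-map addresses.
 */
static uint32_t linear_virt_to_pfn(uint32_t vaddr)
{
	return (vaddr - PAGE_OFFSET + PHYS_OFFSET) >> PAGE_SHIFT;
}

/* A physical address converts to a pfn by a plain shift. */
static uint32_t phys_to_pfn(uint32_t paddr)
{
	return paddr >> PAGE_SHIFT;
}
```

Under these assumptions, linear_virt_to_pfn(0x72000000) and
phys_to_pfn(0x82000000) disagree, while a genuine linear-map address
for the same page would agree.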

> Simplest fix for ARC is to use dma_addr instead because it matches
> real physical memory address and so mapping for user-space we're
> getting then is this:
> ------------------------>8-----------------------
> vaddr = 0x0200_0000
> paddr = 0x8200_0000
> flags = _uncached_
> ------------------------>8-----------------------
> And it works perfectly fine.

But it breaks other architectures, where dma_addr is closer to a
physical address than to a kernel linear map address and so cannot be
passed to virt_to_page().

> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> index 8f8b68c80986..16307eed453f 100644
> --- a/drivers/base/dma-mapping.c
> +++ b/drivers/base/dma-mapping.c
> @@ -252,7 +252,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> #if defined(CONFIG_MMU) && !defined(CONFIG_ARCH_NO_COHERENT_DMA_MMAP)
> unsigned long user_count = vma_pages(vma);
> unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> - unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
> + unsigned long pfn = page_to_pfn(virt_to_page(dma_addr));

As I said above, this is incorrect. I would suggest that you implement
an ARC-specific mmap operation. We do this for arm64 using
remap_pfn_range(); see __swiotlb_mmap() under arch/arm64/mm/dma-mapping.c
where the pfn is calculated using an arm64-specific dma_to_phys()
function.
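
The shape of such an operation, again as a userspace model rather than
a patch: the range check mirrors the one in dma_common_mmap(), and the
pfn comes from the DMA address instead of cpu_addr. struct vma_model,
struct remap_record, model_dma_to_phys() and model_arch_dma_mmap() are
hypothetical stand-ins, and the identity dma-to-phys translation and
8 KiB page size are assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 13u                    /* assumed 8 KiB pages */
#define PAGE_SIZE  (1ul << PAGE_SHIFT)

/* Stand-in for the vm_area_struct fields used here. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_pgoff;
};

/* Records what would have been handed to remap_pfn_range(). */
struct remap_record {
	unsigned long uaddr;
	unsigned long pfn;
	unsigned long len;
};

/* Assumption: bus addresses equal CPU physical addresses. */
static uint64_t model_dma_to_phys(uint64_t dma_addr)
{
	return dma_addr;
}

static int model_arch_dma_mmap(const struct vma_model *vma, uint64_t dma_addr,
			       size_t size, struct remap_record *out)
{
	unsigned long user_count = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
	unsigned long count = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
	unsigned long pfn = (unsigned long)(model_dma_to_phys(dma_addr) >> PAGE_SHIFT);
	unsigned long off = vma->vm_pgoff;

	/* Reject requests that extend past the end of the buffer. */
	if (off >= count || user_count > count - off)
		return -ENXIO;

	out->uaddr = vma->vm_start;
	out->pfn = pfn + off;
	out->len = user_count << PAGE_SHIFT;
	return 0;
}
```

In the real operation the recorded values would go to remap_pfn_range()
together with an uncached or write-combine vma->vm_page_prot.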

--
Catalin