dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)

From: Vineet Gupta
Date: Fri May 18 2018 - 12:24:13 EST


On 05/18/2018 06:11 AM, Alexey Brodkin wrote:
void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
		size_t size, enum dma_data_direction dir)
{
+	if (dir != DMA_TO_DEVICE) {
+		dump_stack();
+		printk(" *** %s@%d: DMA direction is %s instead of %s\n",
+		       __func__, __LINE__, dir_to_str(dir), dir_to_str(DMA_TO_DEVICE));
+	}
+
	return _dma_cache_sync(dev, paddr, size, dir);
}

void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
		size_t size, enum dma_data_direction dir)
{
+	if (dir != DMA_FROM_DEVICE) {
+		dump_stack();
+		printk(" *** %s@%d: DMA direction is %s instead of %s\n",
+		       __func__, __LINE__, dir_to_str(dir), dir_to_str(DMA_FROM_DEVICE));
+	}
+
	return _dma_cache_sync(dev, paddr, size, dir);
}
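
dir_to_str() is a local debugging helper not shown above; a minimal sketch of it,
based on the standard enum dma_data_direction values from <linux/dma-direction.h>,
might look like:

/* Hypothetical debug helper, not part of the patch: name the direction enum. */
static const char *dir_to_str(enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_BIDIRECTIONAL:	return "DMA_BIDIRECTIONAL";
	case DMA_TO_DEVICE:	return "DMA_TO_DEVICE";
	case DMA_FROM_DEVICE:	return "DMA_FROM_DEVICE";
	case DMA_NONE:		return "DMA_NONE";
	default:		return "UNKNOWN";
	}
}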

...
In the case of MMC/DW_MCI (a.k.a. the DesignWare Mobile Storage controller), this is the execution flow:
1) __dw_mci_start_request()
2) dw_mci_pre_dma_transfer()
3) dma_map_sg(..., mmc_get_dma_dir(data))

Note that mmc_get_dma_dir() is just "data->flags & MMC_DATA_WRITE ? DMA_TO_DEVICE : DMA_FROM_DEVICE".
I.e. if we're preparing to send data, dma_noncoherent_map_sg() gets DMA_TO_DEVICE, which is quite OK
to pass on to dma_noncoherent_sync_sg_for_device(). But in the read case we get DMA_FROM_DEVICE, and
dma_noncoherent_map_sg() passes that very direction to dma_noncoherent_sync_sg_for_device() as well.
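
For reference, the map path in the proposed lib/dma-noncoherent.c is shaped roughly
like this (a paraphrased sketch of the patch under discussion, not a verbatim quote);
note how the driver's mapping direction is forwarded unchanged to the for-device hook:

/* Sketch: whatever direction dma_map_sg() was called with goes straight
 * to the for-device sync hook, even when it is DMA_FROM_DEVICE. */
static int dma_noncoherent_map_sg(struct device *dev, struct scatterlist *sgl,
		int nents, enum dma_data_direction dir, unsigned long attrs)
{
	nents = dma_direct_map_sg(dev, sgl, nents, dir, attrs);
	if (nents > 0 && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		dma_noncoherent_sync_sg_for_device(dev, sgl, nents, dir);
	return nents;
}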

I'd say this is not entirely correct: IMHO arch_sync_dma_for_cpu() is supposed to be used only in the
DMA_FROM_DEVICE case, and arch_sync_dma_for_device() only in the DMA_TO_DEVICE case.
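
On a write-back cache that intuition maps onto cache maintenance roughly as follows.
This is a minimal sketch, not any arch's actual implementation; cache_wb_range() and
cache_inv_range() are hypothetical stand-ins for arch primitives (e.g. ARC's
dma_cache_wback()/dma_cache_inv()):

/* Sketch only: DMA_BIDIRECTIONAL would need both operations and is ignored here. */
void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
		size_t size, enum dma_data_direction dir)
{
	/* CPU filled the buffer: write dirty lines back before the device reads it */
	cache_wb_range(paddr, size);		/* the DMA_TO_DEVICE case */
}

void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
		size_t size, enum dma_data_direction dir)
{
	/* device filled the buffer: drop stale lines before the CPU reads it */
	cache_inv_range(paddr, size);		/* the DMA_FROM_DEVICE case */
}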

So roughly 10 years ago, some kernel rookie named Vineet Gupta asked the exact same question :-)

http://kernelnewbies.kernelnewbies.narkive.com/aGW1QcDv/query-about-dma-sync-for-cpu-and-direction-to-device

I never understood the need for this direction argument. And if memory serves me right, at that time I was seeing twice the amount of cache flushing!

But the real fix for my problem is:
---------------------------------------->8------------------------------------
--- a/lib/dma-noncoherent.c
+++ b/lib/dma-noncoherent.c
@@ -35,7 +35,7 @@ static dma_addr_t dma_noncoherent_map_page(struct device *dev, struct page *page
 	addr = dma_direct_map_page(dev, page, offset, size, dir, attrs);
 	if (!dma_mapping_error(dev, addr) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
-		arch_sync_dma_for_device(dev, page_to_phys(page), size, dir);
+		arch_sync_dma_for_device(dev, page_to_phys(page) + offset, size, dir);
 	return addr;
 }
---------------------------------------->8------------------------------------

You seem to have lost the offset into the page, so if we happen to have a buffer that is not aligned
to a page boundary we were obviously doing cache maintenance on data outside our buffer :)
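
To make the failure mode concrete, here's a worked example with made-up numbers
(4 KiB pages, addresses purely illustrative):

/*
 * Hypothetical values:
 *   page_to_phys(page) = 0x80001000  (start of the page)
 *   offset             = 0x200
 *   size               = 0x100
 *
 * Buggy:  arch_sync_dma_for_device(dev, 0x80001000, 0x100, dir);
 *         -> operates on [0x80001000, 0x80001100), i.e. someone
 *            else's data at the start of the page, while the actual
 *            buffer's cache lines are left untouched.
 *
 * Fixed:  arch_sync_dma_for_device(dev, 0x80001000 + 0x200, 0x100, dir);
 *         -> operates on [0x80001200, 0x80001300), the real buffer.
 */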

Neat!
