Re: [PATCH v2 13/29] nios2: DMA mapping API

From: Ley Foon Tan
Date: Thu Jul 24 2014 - 07:37:20 EST


On Tue, Jul 15, 2014 at 5:38 PM, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> On Tuesday 15 July 2014 16:45:40 Ley Foon Tan wrote:
>
>> +static inline void __dma_sync(void *vaddr, size_t size,
>> + enum dma_data_direction direction)
>> +{
>> + switch (direction) {
>> + case DMA_FROM_DEVICE: /* invalidate cache */
>> + invalidate_dcache_range((unsigned long)vaddr,
>> + (unsigned long)(vaddr + size));
>> + break;
>> + case DMA_TO_DEVICE: /* flush and invalidate cache */
>> + case DMA_BIDIRECTIONAL:
>> + flush_dcache_range((unsigned long)vaddr,
>> + (unsigned long)(vaddr + size));
>> + break;
>> + default:
>> + BUG();
>> + }
>> +}
>
> This seems strange. More on that below.
>
>> +#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
>> +#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
>> +
> ...
>> +static inline void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
>> + enum dma_data_direction direction)
>> +{
>> + __dma_sync(vaddr, size, direction);
>> +}
>
> IIRC dma_cache_sync should be empty if you define dma_alloc_noncoherent
> to be the same as dma_alloc_coherent: It's already coherent, so no sync
> should be needed. What does the CPU do if you try to invalidate the cache
> on a coherent mapping?
Okay, I see what you mean here. I will leave the dma_cache_sync()
function empty.
The CPU simply does nothing if we try to invalidate the cache on a coherent region.
BTW, I found that many other architectures still provide dma_cache_sync()
even though they define dma_alloc_noncoherent to be the same as
dma_alloc_coherent, e.g. blackfin, x86 and xtensa.
>
>> +void dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle,
>> + size_t size, enum dma_data_direction direction)
>> +{
>> + BUG_ON(!valid_dma_direction(direction));
>> +
>> + __dma_sync(phys_to_virt(dma_handle), size, direction);
>> +}
>> +EXPORT_SYMBOL(dma_sync_single_for_cpu);
>> +
>> +void dma_sync_single_for_device(struct device *dev, dma_addr_t dma_handle,
>> + size_t size, enum dma_data_direction direction)
>> +{
>> + BUG_ON(!valid_dma_direction(direction));
>> +
>> + __dma_sync(phys_to_virt(dma_handle), size, direction);
>> +}
>> +EXPORT_SYMBOL(dma_sync_single_for_device);
>
> More importantly: you do the same operation for both _for_cpu and _for_device.
> I assume your CPU can never do speculative cache prefetches, so it's not
> incorrect, but you do twice the number of invalidations and flushes that
> you need.
>
> Why would you do anything for _for_cpu here?
I am a bit confused about _for_cpu and _for_device here. I found that some
architectures, like c6x and hexagon, also perform the same operation for
both _for_cpu and _for_device.
I have spent some time looking at other architectures, and below is what
I found. Please correct me if I am wrong, especially about
for_device():DMA_FROM_DEVICE.

_for_cpu():
	case DMA_BIDIRECTIONAL:
	case DMA_FROM_DEVICE:
		/* invalidate cache */
		break;
	case DMA_TO_DEVICE:
		/* do nothing */
		break;

-------------------------
_for_device():
	case DMA_BIDIRECTIONAL:
	case DMA_TO_DEVICE:
		/* flush and invalidate cache */
		break;
	case DMA_FROM_DEVICE:
		/* should we invalidate cache or do nothing? */
		break;
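
The split above can be sketched as a small user-space mock. This is only
an illustration, not the patch itself: the enum is a stand-in for the
kernel definition, the two cache routines are replaced by hypothetical
counters (nr_invalidate, nr_flush), and the sketch_* function names are
made up. The DMA_FROM_DEVICE case in the for_device path is an
assumption (invalidate, so that no dirty line can later be evicted on
top of data written by the device); that is exactly the open question
above.

```c
#include <stddef.h>

/* Stand-in for the kernel enum so the sketch builds in user space. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Hypothetical counters in place of the real cache maintenance. */
static int nr_invalidate, nr_flush;

static void invalidate_dcache_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	nr_invalidate++;
}

static void flush_dcache_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	nr_flush++;
}

/* The CPU is about to access the buffer after the device used it. */
static void sketch_sync_for_cpu(void *vaddr, size_t size,
				enum dma_data_direction direction)
{
	unsigned long start = (unsigned long)vaddr;

	switch (direction) {
	case DMA_BIDIRECTIONAL:
	case DMA_FROM_DEVICE:
		/* drop stale lines so the CPU sees the device's data */
		invalidate_dcache_range(start, start + size);
		break;
	case DMA_TO_DEVICE:
		/* the device only read the buffer; nothing to do */
		break;
	default:
		break;	/* the kernel version would BUG() here */
	}
}

/* The device is about to access the buffer after the CPU used it. */
static void sketch_sync_for_device(void *vaddr, size_t size,
				   enum dma_data_direction direction)
{
	unsigned long start = (unsigned long)vaddr;

	switch (direction) {
	case DMA_BIDIRECTIONAL:
	case DMA_TO_DEVICE:
		/* write back dirty lines so the device sees the CPU's data */
		flush_dcache_range(start, start + size);
		break;
	case DMA_FROM_DEVICE:
		/* ASSUMPTION: invalidate rather than do nothing */
		invalidate_dcache_range(start, start + size);
		break;
	default:
		break;	/* the kernel version would BUG() here */
	}
}
```

With this split, a DMA_TO_DEVICE buffer costs one flush in for_device
and nothing in for_cpu, instead of one flush in each.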

Thanks for the review.

Regards
Ley Foon