Re: [PATCH v3 1/5] x86: make dma_alloc_coherent() return zeroed memory if CMA is enabled

From: Akinobu Mita
Date: Thu Apr 17 2014 - 11:41:01 EST


2014-04-17 4:44 GMT+09:00 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>:
> On Tue, 15 Apr 2014 22:08:45 +0900 Akinobu Mita <akinobu.mita@xxxxxxxxx> wrote:
>
>> Calling dma_alloc_coherent() with __GFP_ZERO must return zeroed memory.
>>
>> But when the contiguous memory allocator (CMA) is enabled on x86 and
>> the memory region is allocated by dma_alloc_from_contiguous(), it
>> doesn't return zeroed memory. Because dma_generic_alloc_coherent()
>> forgot to fill the memory region with zero if it was allocated by
>> dma_alloc_from_contiguous()
>>
>> Most implementations of dma_alloc_coherent() return zeroed memory
>> regardless of whether __GFP_ZERO is specified. So this fixes it by
>> unconditionally zeroing the allocated memory region.
>>
>> Alternatively, we could fix dma_alloc_from_contiguous() to return
>> zeroed out memory and remove memset() from all caller of it. But we
>> can't simply remove the memset on arm because __dma_clear_buffer() is
>> used there for ensuring cache flushing and it is used in many places.
>> Of course we can do redundant memset in dma_alloc_from_contiguous(),
>> but I think this patch is less impact for fixing this problem.
>
> But this patch does a duplicated memset if the page was allocated by
> alloc_pages_node()?

You're right. Clearing __GFP_ZERO bit in gfp flags before allocating by
alloc_pages_node() can fix this duplicated memset.

> Would it not be better to pass the gfp_t to dma_alloc_from_contiguous()
> and have it implement __GFP_ZERO? That will fix thsi inefficiency,
> will be symmetrical with the other underlying allocators and should
> permit the appropriate fixups in arm?

Sounds good. If it also handles __GFP_WAIT, We can remove __GFP_WAIT
check which is almost always required before calling
dma_alloc_from_contiguous().

>> --- a/arch/x86/kernel/pci-dma.c
>> +++ b/arch/x86/kernel/pci-dma.c
>> @@ -97,7 +97,6 @@ void *dma_generic_alloc_coherent(struct device *dev, size_t size,
>>
>> dma_mask = dma_alloc_coherent_mask(dev, flag);
>>
>> - flag |= __GFP_ZERO;

I'll soon prepare a follow-up patch to clear __GFP_ZERO like

+ flag &= ~__GFP_ZERO

>> again:
>> page = NULL;
>> /* CMA can be used only in the context which permits sleeping */
>> @@ -120,7 +119,7 @@ again:
>>
>> return NULL;
>> }
>> -
>> + memset(page_address(page), 0, size);
>> *dma_addr = addr;
>> return page_address(page);
>> }
>> --
>> 1.8.3.2
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/