Re: [PATCH v4 6/7] mtd: nand: omap2: Fix high memory dma prefetch transfer

From: Franklin S Cooper Jr.
Date: Wed Apr 13 2016 - 17:12:08 EST

On 04/13/2016 03:24 PM, Boris Brezillon wrote:
> Hi Franklin,
>
> On Wed, 13 Apr 2016 15:08:12 -0500
> "Franklin S Cooper Jr." <fcooper@xxxxxx> wrote:
>
>>
>>
>> On 03/21/2016 10:04 AM, Boris Brezillon wrote:
>>> Hi Franklin,
>>>
>>> On Thu, 10 Mar 2016 17:56:42 -0600
>>> Franklin S Cooper Jr <fcooper@xxxxxx> wrote:
>>>
>>>> Based on DMA documentation and testing using high memory buffer when
>>>> doing dma transfers can lead to various issues including kernel
>>>> panics.
>>>
>>> I guess it all comes from the vmalloced buffer case: such buffers are not
>>> guaranteed to be physically contiguous (one of the DMA requirements,
>>> unless you have an iommu).
>>>
>>>>
>>>> To work around this, simply use CPU copy. High memory buffers are
>>>> used very rarely, so no noticeable performance hit should
>>>> be seen.
>>>
>>> Hm, that's not necessarily true. UBI and UBIFS allocate their buffers
>>> using vmalloc (vmalloced buffers fall in the high_memory region), and
>>> those are likely to be discontiguous if you have NANDs with pages > 4k.
>>>
>>> I recently posted patches to ease sg_table creation from any kind of
>>> virtual address [1][2]. Can you try them and let me know if it fixes
>>> your problem?
>>
>> It looks like you won't be going forward with your patchset based on
>> this thread [1].
>
> Nope. According to Russell it's unsafe to do that.
>
>> I can probably reword the patch description to avoid
>> implying that it is uncommon to run into high mem buffers. Also, DMA with
>> NAND prefetch performs worse than CPU
>> polling with prefetch. This is largely due to the significant overhead
>> required to read such a small amount of data at a time. The
>> optimizations I've worked on all revolved around reducing the cycles
>> spent before executing the DMA request. Making a high memory
>> buffer usable by the DMA adds a significant number of cycles, and
>> you're better off just using the CPU for performance reasons.
>
> Okay.
> One comment though: why not use virt_addr_valid() instead of
> addr >= high_memory here?


I had no reason other than following the approach already used in the
driver. virt_addr_valid() looks like it will work, so I'll make the switch
after testing it.
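
For reference, here is a minimal userspace sketch of how the two checks
differ. It is not the driver code: the address constants, the mock_*
helpers and pick_xfer_mode() are all hypothetical names made up for
illustration. The real virt_addr_valid() is architecture specific; the mock
assumes the 32-bit ARM behaviour as I understand it (address inside the
linear mapping, i.e. at or above PAGE_OFFSET and below high_memory) and
leaves out the additional pfn validity check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout values for illustration only (not from a real board). */
#define MOCK_PAGE_OFFSET  ((uintptr_t)0xC0000000UL)
#define MOCK_HIGH_MEMORY  ((uintptr_t)0xEF800000UL)

/* Style of check currently used: only addresses at or above high_memory
 * are treated as unsuitable for DMA. */
static inline bool mock_is_highmem(uintptr_t addr)
{
	return addr >= MOCK_HIGH_MEMORY;
}

/* Stand-in for virt_addr_valid(): true only for addresses inside the
 * linear (lowmem) mapping. Assumption: the pfn check is omitted here. */
static inline bool mock_virt_addr_valid(uintptr_t addr)
{
	return addr >= MOCK_PAGE_OFFSET && addr < MOCK_HIGH_MEMORY;
}

enum xfer_mode { XFER_CPU_COPY, XFER_DMA };

/* Use DMA only for linear-map buffers; otherwise fall back to CPU copy. */
static inline enum xfer_mode pick_xfer_mode(uintptr_t addr)
{
	return mock_virt_addr_valid(addr) ? XFER_DMA : XFER_CPU_COPY;
}
```

With these mock values, a vmalloc-range address is rejected by both checks,
while an address below PAGE_OFFSET passes the addr >= high_memory test but
is rejected by the virt_addr_valid()-style test. So the latter is the
stricter of the two.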
>
> Best Regards,
>
> Boris
>
>