Re: [PATCH V8 00/14] mmc: Add Command Queue support

From: Adrian Hunter
Date: Tue Oct 10 2017 - 08:31:08 EST


On 10/10/17 15:12, Ulf Hansson wrote:
> On 21 September 2017 at 11:44, Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>> On 21/09/17 12:01, Ulf Hansson wrote:
>>> On 13 September 2017 at 13:40, Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>>>> Hi
>>>>
>>>> Here is V8 of the hardware command queue patches without the software
>>>> command queue patches, now using blk-mq and now with blk-mq support for
>>>> non-CQE I/O.
>>>>
>>>> After the unacceptable debacle of the last release cycle, I expect an
>>>> immediate response to these patches.
>>>>
>>>> HW CMDQ offers 25% - 50% better random multi-threaded I/O. I see a slight
>>>> 2% drop in sequential read speed but no change to sequential write.
>>>>
>>>> Non-CQE blk-mq showed a 3% decrease in sequential read performance. This
>>>> seemed to be coming from the inferior latency of running work items compared
>>>> with a dedicated thread. Hacking blk-mq workqueue to be unbound reduced the
>>>> performance degradation from 3% to 1%.
>>>>
>>>> While we should look at changing blk-mq to give better workqueue performance,
>>>> a bigger gain is likely to be made by adding a new host API to enable the
>>>> next already-prepared request to be issued directly from within ->done()
>>>> callback of the current request.
>>>
>>> Adrian, I am reviewing this series, however let me comment on each
>>> change individually.
>>>
>>> I have also run some tests on my ux500 board, enabling the blk-mq
>>> path via the new MMC Kconfig option. My idea was to run some iozone
>>> comparisons between the legacy path and the new blk-mq path, but I just
>>> couldn't get to that point because of the following errors.
>>>
>>> I am using a Kingston 4GB SDHC card, which is detected and mounted
>>> nicely. However, when I decide to do some writes to the card I get the
>>> following errors.
>>>
>>> root@ME:/mnt/sdcard dd if=/dev/zero of=testfile bs=8192 count=5000 conv=fsync
>>> [ 463.714294] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 464.722656] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 466.081481] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 467.111236] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 468.669647] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 469.685699] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 471.043334] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 472.052337] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 473.342651] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 474.323760] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 475.544769] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 476.539031] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 477.748474] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>> [ 478.724182] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
>>>
>>> I haven't yet got to the point of investigating this any further, and
>>> unfortunately I have a busy schedule with traveling next week. I will do
>>> my best to look into this as soon as I can.
>>>
>>> Perhaps you have some ideas?
>>
>> The behaviour depends on whether you have MMC_CAP_WAIT_WHILE_BUSY. Try
>> changing that and see if it makes a difference.
>
> Yes, it does! I disabled MMC_CAP_WAIT_WHILE_BUSY (and its
> corresponding code in mmci.c) and the errors go away.
>
> When I use MMC_CAP_WAIT_WHILE_BUSY I get these problems:
>
> [ 223.820983] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
> [ 224.815795] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
> [ 226.034881] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
> [ 227.112884] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
> [ 227.220275] mmc0: Card stuck in wrong state! mmcblk0 mmc_blk_card_stuck
> [ 228.686798] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
> [ 229.892150] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
> [ 231.031890] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
> [ 232.239013] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
> 5000+0 records in
> 5000+0 records out
> root@ME:/mnt/sdcard
>
> I looked at the new blk-mq code from patch v10 13/15. It seems that
> MMC_CAP_WAIT_WHILE_BUSY is used to determine whether the async request
> mechanism should be used or not. Perhaps I didn't look closely enough,
> but could you elaborate on why this is the case?

MMC_CAP_WAIT_WHILE_BUSY is necessary because it means that a data transfer
request has finished when the host controller calls mmc_request_done(), i.e.
polling the card is not necessary.
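
Roughly, the decision can be modelled as below. This is only a standalone
userspace sketch, not the code from the patches: the sketch_* names, the
cut-down host structure and the bit value are all made up for illustration,
and the real capability flag lives in include/linux/mmc/host.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the kernel capability bit. The value is
 * arbitrary for this sketch and is not copied from the kernel headers. */
#define SKETCH_CAP_WAIT_WHILE_BUSY (1u << 0)

/* Hypothetical, cut-down model of a host: only the capability mask. */
struct sketch_host {
	unsigned int caps;
};

/* True when a data transfer is finished as soon as the host reports it
 * done, because the controller itself waited for the card to stop
 * signalling busy. */
static bool sketch_done_means_finished(const struct sketch_host *host)
{
	assert(host != NULL);
	return (host->caps & SKETCH_CAP_WAIT_WHILE_BUSY) != 0;
}

/* True when the card status would still have to be polled after the
 * host reports the request done. */
static bool sketch_needs_card_polling(const struct sketch_host *host)
{
	return !sketch_done_means_finished(host);
}
```

With the capability set, completion can be handled directly when the host
reports the request done; without it, the card has to be polled first.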

Have you tried V9 or V10? There was a fix in V9 related to calling
->post_req() which could mess up DMA.

The other thing that could go wrong with DMA is if the host driver cannot
accept ->post_req() being called from mmc_request_done().