Re: [PATCH] mmc: dw_mmc: Wait for data transfer after response errors

From: Enric Balletbo Serra
Date: Wed Mar 30 2016 - 13:16:25 EST

2016-03-24 17:22 GMT+01:00 Russell King - ARM Linux <linux@xxxxxxxxxxxxxxxx>:
> On Thu, Mar 24, 2016 at 09:06:45AM -0700, Doug Anderson wrote:
>> Russell,
> ...
>> Presumably this is similar to what you saw: the host saw the CRC error
>> but the card knew nothing about it. Sending the stop command during
>> this time confused the card. Presumably the card was in transfer
>> state during this time?
> If the card was in transfer state for a command which expects a stop
> command, and that stop command was issued after the card entered
> the transfer state, then I'd expect the card to handle it... though
> there's always the firmware bug issue.
> If the card hadn't entered transfer state at the time the stop command
> was issued.. I think that's more likely to hit card firmware issues.
> With the tuning commands, there's another case you can hit though:
> the data transfer may have completed before you get around to sending
> the stop command.
> That's why, for sdhci, I came to the conclusion that waiting for the
> data transfer to complete or timeout was the best solution for SDHCI.

In fact I only saw the problem with dw_mmc-exynos. On dw_mmc-rockchip
it doesn't happen because that driver enables the
DW_MCI_QUIRK_BROKEN_DTO quirk. What this quirk does is use a kernel
timer to signal when the DTO interrupt does NOT come. Note that if I
disable this quirk I can also see the problem on rockchip.
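
To make the idea concrete, here is a rough userspace model of that
fallback. It is only a sketch and not the dw_mmc code: every name in it
(fake_host, start_data_wait, dto_irq, timer_tick, may_send_stop) is made
up for illustration, and the real driver uses a kernel timer and
interrupt handlers instead of plain function calls. The point is just
that the host leaves the "waiting for DTO" state either when the
interrupt arrives or when the timer expires, and only then considers
sending a stop command.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states of a host waiting for "data transfer over" (DTO). */
enum xfer_state { XFER_IDLE, XFER_WAIT_DTO, XFER_DONE, XFER_TIMED_OUT };

/* Hypothetical host model; not the driver's real data structure. */
struct fake_host {
	enum xfer_state state;
	uint64_t dto_deadline;	/* tick at which the fallback timer fires */
};

/* Start waiting for DTO and arm the fallback timer. */
static void start_data_wait(struct fake_host *h, uint64_t now,
			    uint64_t timeout_ticks)
{
	h->state = XFER_WAIT_DTO;
	h->dto_deadline = now + timeout_ticks;
}

/* Models the DTO interrupt: the transfer finished normally. */
static void dto_irq(struct fake_host *h)
{
	if (h->state == XFER_WAIT_DTO)
		h->state = XFER_DONE;
}

/* Models the timer callback: give up if DTO never came. */
static void timer_tick(struct fake_host *h, uint64_t now)
{
	if (h->state == XFER_WAIT_DTO && now >= h->dto_deadline)
		h->state = XFER_TIMED_OUT;
}

/* A stop command is only considered once the wait has ended. */
static bool may_send_stop(const struct fake_host *h)
{
	return h->state == XFER_DONE || h->state == XFER_TIMED_OUT;
}
```

With this model, a late DTO interrupt after the timeout is ignored, and
a late timer tick after a normal DTO is ignored as well, so the wait is
ended by exactly one of the two events.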

> Maybe, if sending a STOP command does cause card firmware issues, then:
> 1) it provides evidence that trying to send a stop command on response
> CRC error is the wrong thing to do (it was talked about making SDHCI
> do this.)

It seems to be the same here, so I guess it is the wrong thing to do.

> 2) it suggests that the solution I came up with for SDHCI is the better
> solution, rather than trying to immediately recover the situation by
> sending a STOP command.

I'm wondering whether just enabling this quirk on exynos too is the
proper solution. Unfortunately I don't have enough documentation to
check the differences between those controllers.
Also, it would really help to have access to some hardware that uses
dw_mmc-pltfm to check whether, like on exynos, the same issue is
triggered. Is there anyone with the hardware who can do some tests?

> Maybe dw-mmc can do something similar, but with the lack of data transfer
> timeout, maybe it's possible to do something with a kernel timer instead,
> and check what the hardware is doing after a response CRC error?