Re: [PATCH] mmc: dw_mmc: Wait for data transfer after response errors

From: Russell King - ARM Linux
Date: Wed Mar 30 2016 - 13:26:29 EST

On Wed, Mar 30, 2016 at 07:16:18PM +0200, Enric Balletbo Serra wrote:
> 2016-03-24 17:22 GMT+01:00 Russell King - ARM Linux <linux@xxxxxxxxxxxxxxxx>:
> > On Thu, Mar 24, 2016 at 09:06:45AM -0700, Doug Anderson wrote:
> >> Russell,
> > ...
> >> Presumably this is similar to what you saw: the host saw the CRC error
> >> but the card knew nothing about it. Sending the stop command during
> >> this time confused the card. Presumably the card was in transfer
> >> state during this time?
> >
> > If the card was in transfer state for a command which expects a stop
> > command, and that stop command was issued after the card entered
> > the transfer state, then I'd expect the card to handle it... though
> > there's always the firmware bug issue.
> >
> > If the card hadn't entered transfer state at the time the stop command
> > was issued.. I think that's more likely to hit card firmware issues.
> >
> > With the tuning commands, there's another case you can hit though:
> > the data transfer may have completed before you get around to sending
> > the stop command.
> >
> > That's why, for sdhci, I came to the conclusion that waiting for the
> > data transfer to complete or timeout was the best solution for SDHCI.
> >
> In fact I only saw the problem with dw_mmc-exynos; on dw_mmc-rockchip
> it doesn't happen because it enables the DW_MCI_QUIRK_BROKEN_DTO
> behaviour. What this quirk does is use a kernel timer to signal when
> the DTO interrupt does NOT come. Note that if I disable this quirk I
> can also see the problem on rockchip.
>
> > Maybe, if sending a STOP command does cause card firmware issues, then:
> >
> > 1) it provides evidence that trying to send a stop command on response
> > CRC error is the wrong thing to do (it was talked about making SDHCI
> > do this.)
> >
> Seems the same here, so I guess it is the wrong thing to do.
>
> > 2) it suggests that the solution I came up with for SDHCI is the better
> > solution, rather than trying to immediately recover the situation by
> > sending a STOP command.
> >
> I'm wondering if just enabling this quirk on exynos too is the proper
> solution. Unfortunately I don't have enough documentation to check
> the differences between those controllers.
>
> It would also really help to have access to some hardware that uses
> dw_mmc-pltfm to check whether, like on exynos, the same issue is
> triggered. Anyone with the hardware who can do some tests?

I'd really suggest that the dw-mmc folk place a moratorium on quirk
flags, and instead deal with situations like this without resorting
to this kind of thing.

sdhci is a good example of why the quirk flag approach is totally wrong,
and shows that it leads to an unmaintainable mess. If dw-mmc people
don't want the driver to descend into the same state that sdhci is in,
then things like this should not be quirks. sdhci already has a
long-term moratorium on quirk flags until the resulting mess has been
cleaned up.

The danger that quirk flags cause is also highlighted in your mail:
it's very likely that this _isn't_ a host controller issue at all,
but a MMC protocol issue or a card issue - and the behaviour required
here is not specific to any particular host controller. The problem
with having a quirk flag for it is that you end up with some hosts
enabling it, and other hosts having it disabled only because they
haven't yet tripped over the issue.
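
To make that concrete, here is a toy userspace model of the two approaches.
It is only a sketch and not dw_mmc code: every name in it (host_model,
scenario, wait_quirk_gated, wait_always_bounded) is made up for
illustration, and it assumes the failure mode is simply "the data-over
interrupt never arrives after a response error".

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the kernel. */
enum xfer_event {
	EV_DATA_OVER,		/* data-over interrupt arrived */
	EV_DATA_TIMEOUT,	/* software timer fired instead */
	EV_NONE,		/* nothing ever completes the request */
};

struct host_model {
	bool quirk_broken_dto;	/* stand-in for a per-host quirk flag */
};

struct scenario {
	bool data_over_irq_arrives;	/* false: card/protocol issue ate it */
};

/* Quirk-gated: the software timeout is only armed if the host opted in. */
static enum xfer_event wait_quirk_gated(const struct host_model *h,
					const struct scenario *s)
{
	if (s->data_over_irq_arrives)
		return EV_DATA_OVER;
	if (h->quirk_broken_dto)
		return EV_DATA_TIMEOUT;
	return EV_NONE;		/* host without the flag hangs here */
}

/* Unconditional: every host waits for completion or a bounded timeout. */
static enum xfer_event wait_always_bounded(const struct scenario *s)
{
	return s->data_over_irq_arrives ? EV_DATA_OVER : EV_DATA_TIMEOUT;
}
```

In this model, a host that has not set the flag gets EV_NONE in exactly the
scenario where a host with the flag recovers, even though the scenario
itself has nothing to do with the host. The unconditional variant has no
such split.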
