Re: [PATCH v3] mmc: dw_mmc: add quirk for broken data transfer over scheme

From: Alexandru Stan
Date: Fri Dec 05 2014 - 18:56:39 EST


Seems like the BROKEN_DTO also affects EVENT_DATA_COMPLETE, sometimes we don't
see that (but we do see EVENT_XFER_COMPLETE), leave the state machine and never
come back to it to timeout. I have this happening on emmc specifically, haven't
seen it on sdmmc yet.

The fix is to also start the timer when we don't see a EVENT_DATA_COMPLETE.

More details for how I debugged this can be found at:
https://chromium-review.googlesource.com/#/c/233691/

In ~100 reboot tests I haven't seen any hangs, whereas before it was
hanging about 20% of the time.

diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
index 50865e7..c678206 100644
--- a/drivers/mmc/host/dw_mmc.c
+++ b/drivers/mmc/host/dw_mmc.c
@@ -1685,8 +1685,17 @@ static void dw_mci_tasklet_func(unsigned long priv)

case STATE_DATA_BUSY:
if (!test_and_clear_bit(EVENT_DATA_COMPLETE,
- &host->pending_events))
+ &host->pending_events)) {
+ if (host->quirks & DW_MCI_QUIRK_BROKEN_DTO) {
+ drto_clks = mci_readl(host, TMOUT) >> 8;
+ drto_ms = DIV_ROUND_UP(drto_clks * 1000,
+ host->bus_hz);
+
+ mod_timer(&host->dto_timer, jiffies +
+ msecs_to_jiffies(drto_ms));
+ }
break;
+ }

host->data = NULL;
set_bit(EVENT_DATA_COMPLETE, &host->completed_events);
Alexandru Stan (amstan)


On Tue, Dec 2, 2014 at 9:08 PM, Doug Anderson <dianders@xxxxxxxxxxxx> wrote:
> Addy,
>
> On Tue, Dec 2, 2014 at 7:16 PM, Addy Ke <addy.ke@xxxxxxxxxxxxxx> wrote:
>> This patch add a new quirk to add a s/w timer to notify the driver
>> to terminate current transfer and report a data timeout to the core,
>> if DTO interrupt does NOT come within the given time.
>>
>> dw_mmc call mmc_request_done func to finish transfer depends on
>> DTO interrupt. If DTO interrupt does not come in sending data state,
>> the current transfer will be blocked.
>>
>> But this case really exists, when driver reads tuning data from
>> card on RK3288-pink2 board. I measured waveforms by oscilloscope
>> and found that card clock was always on and data lines were always
>> holded high level in sending data state.
>>
>> We got the reply from synopsys:
>> There are two counters but both use the same value of [31:8] bits.
>> Data timeout counter doesn't wait for stop clock and you should get
>> DRTO even when the clock is not stopped.
>> Host Starvation timeout counter is triggered with stop clock condition.
>>
>> This means that host should get DRTO and DTO interrupt.
>>
>> But we really don't get any data-related interrupt in RK3X SoCs.
>> And driver can't get data transfer state, it can do nothing but wait for.
>>
>> We don't know why we have this problem, but we need it to fix this problem now.
>> And I will post a follow up change when we find the root cause.
>>
>> Signed-off-by: Addy Ke <addy.ke@xxxxxxxxxxxxxx>
>> ---
>> Changes in v2:
>> - fix some typo.
>> - remove extra timeout value (250ms).
>> - remove dw_mci_dto_start_monitor func.
>> - use "broken-dto" for new quirk and change Subject for it.
>>
>> Changes in v3:
>> - Remove dts for broken-dto, just add this quirk in dw_mci_rockchip_init
>>
>> drivers/mmc/host/dw_mmc-rockchip.c | 3 +++
>> drivers/mmc/host/dw_mmc.c | 41 +++++++++++++++++++++++++++++++++++++-
>> include/linux/mmc/dw_mmc.h | 5 +++++
>> 3 files changed, 48 insertions(+), 1 deletion(-)
>
> This seems reasonable to me. I do hope that you can get to a root
> cause, but I don't think this is a terrible bit of code to carry.
> Obviously it's up to the folks who maintain dw_mmc and the mmc
> subsystem, but:
>
> Reviewed-by: Doug Anderson <dianders@xxxxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/