Re: [PATCH 3/3] mmc: dw_mmc: Add locking to the CTO timer

From: Doug Anderson
Date: Wed Oct 11 2017 - 19:54:00 EST


Hi,

On Mon, Oct 9, 2017 at 12:41 AM, Shawn Lin <shawn.lin@xxxxxxxxxxxxxx> wrote:
> Hi Doug,
>
> On 2017/9/28 4:56, Douglas Anderson wrote:
>>
>> This attempts to instill a bit of paranoia to the code dealing with
>> the CTO timer. It's believed that this will make the CTO timer more
>> robust in the case that we're having very long interrupt latencies.
>>
>
> I have already got reports about the similar problem that on one of
> the Rockchip platforms, some IP cyclicity idle the bus too long,
> so that cto timer fires but actually it shouldn't.

So presumably this patch fixes them?


>> Note that I originally thought that perhaps this patch was being
>> overly paranoid and wasn't really needed, but then while I was running
>> mmc_test on an rk3399 board I saw one instance of the message:
>> dwmmc_rockchip fe320000.dwmmc: Unexpected interrupt latency
>>
>
> The intention of introducing CTO and DRTO timers is simply to break the
> dead loop due to the bug of controller itself but it seems the timer
> should take more factors into consideration. So it's more complicated
> than expected, expecially we should also fix the drto case like what
> this patch does...
>
> How about combining the cto and drto timer and re-new a catch-all
> timer like what SDHCI did?

I don't think that having separate timers really adds a lot of
overhead though, does it? gdb shows "struct timer_list" on arm64 as
being 112 bytes so I guess there's some overhead there. ...but
presumably you'd then need to add some code to differentiate the
command timeout and data timeout. That adds more code and more
changes and makes this patch series riskier to backport to stable
trees (and, since it fixes a regression, I think it should be
backported assuming we don't land this in time for 4.14, which is
looking unlikely).

So I guess I'd say I'll keep them as separate timers in my patch
series unless someone is really upset by it and if someone wants to
see if things are cleaner by changing it to one timer then it'd be
great!


>> @@ -2600,6 +2633,8 @@ static irqreturn_t dw_mci_interrupt(int irq, void
>> *dev_id)
>> struct dw_mci *host = dev_id;
>> u32 pending;
>> struct dw_mci_slot *slot = host->slot;
>> + unsigned long irqflags;
>> + int i;

It was pointed out by Emil that I messed up while merging this from
the Chromium OS tree (on kernel 4.4) and accidentally added "i" in
here. While the patch submitted compiles and works it introduces a
stupid compiler warning.

I'll re-post tomorrow.