Re: [PATCH v1 2/2] mmc: sdhci: Use the SW timer when the HW timer cannot meet the timeout value required by the device

From: Bean Huo
Date: Wed Sep 29 2021 - 06:49:05 EST


Hi Adrian,

On Tue, 2021-09-28 at 13:18 +0300, Adrian Hunter wrote:
> On 25/09/2021 00:33, Bean Huo wrote:
> > On Fri, 2021-09-24 at 16:26 +0300, Adrian Hunter wrote:
> > > On 24/09/21 4:08 pm, Bean Huo wrote:
> > > > On Fri, 2021-09-24 at 15:17 +0300, Adrian Hunter wrote:
> > > > > > > > sdhci_writeb(host, count,
> > > > > > > > SDHCI_TIMEOUT_CONTROL);
> > > > > > > > }
> > > > > > > > The driver has detected that the hardware timer cannot
> > > > > > > > meet
> > > > > > > > the
> > > > > > > > timeout
> > > > > > > > requirements of the device, but we still use the
> > > > > > > > hardware
> > > > > > > > timer,
> > > > > > > > which will
> > > > > > > > allow potential timeout issuea . Rather than allowing a
> > > > > > > > potential
> > > > > > > > problem to exist, why can’t software timing be used to
> > > > > > > > avoid
> > > > > > > > this
> > > > > > > > problem?
> > > > > > > Timeouts aren't that accurate. The maximum is assumed
> > > > > > > still
> > > > > > > to
> > > > > > > work.
> > > > > > > mmc->max_busy_timeout is used to tell the core what the
> > > > > > > maximum
> > > > > > > is.
> > > > > > mmc->max_busy_timeout is still a representation of Host HW
> > > > > > timer
> > > > > > maximum timeout count, isn't it?
> > > > >
> > > > > Not necessarily. For SDHCI_QUIRK2_DISABLE_HW_TIMEOUT it
> > > > > would be
> > > > >
> > > > > set to zero to indicate no maximum.
> > > >
> > > > yes, this is the purpose of the patch, for the host controller
> > > > without
> > > > quirk SDHCI_QUIRK2_DISABLE_HW_TIMEOUT, if the timeout count
> > > > required by
> > > > device is beyond the HW timer max count, we choose SW timer to
> > > > avoid the HW timer timeout IRQ.
> > > >
> > > > I don't know if I get it correctly.
> > >
> > > Why can't drivers that want the behaviour just set the quirk?
> > >
> > > Drivers that do not work with the quirk, do not have to set it.
> >
> > Adrian,
> >
> > We cannot add this quirk to every host driver.
>
> I was suggesting only the ones for which it works.
>
> > This is the difference
> > on the device side.
>
> It is the host controller that has the problem, not the device.
> Hence the quirk.
>
> > In addition, I don't know what the maximum hardware
> > timer budget for each host is.
>
> mmc->max_busy_timeout is calculated by sdhci.c, or the driver can
> override the maximum count via ->get_max_timeout_count() or the
> driver
> can override mmc->max_busy_timeout.
>
> With the quirk, sdhci.c will usually set mmc->max_busy_timeout to
> zero.
> That allows timeouts greater than the hardware can support, and then,
> with the quirk, the driver will switch to a software timeout when
> needed.
>


According to your above statement, do you mean that the eMMC host
controller does not support the scenario where the data transmission
timeout value required by the eMMC device is greater than the capacity
of the eMMC host hardware data transmission timer? Unless the eMMC host
vendor accepts their eMMC host accepts the use of SW timers in this
case (adding quirks)?


> However, that won't work for every host controller, because some do
> not
> provide a completion interrupt after the timeout, even if the timeout
> interrupt is disabled.

Do you mean that if we disable the hardware timer/timeout interrupt and
use the software timer, the eMMC host controller will not trigger a
completion interrupt? Even before the SW timer expires, the data
transfer between the host and the eMMC device is complete? Is this what
you mean?


> That means they should set mmc->max_busy_timeout
> to the hardware value. Hence the quirk is needed to tell the
> difference.
>

This means this quirk is for eMMC host can privode the completion
interrupt in case HW timer is disabled?


> > Even if you use the same SOC, the
> > maximum time budget on different platforms may be different.
>
> The mmc core calculates timeouts based on the relevant standards and
> values provided by the device itself.


Yes, but the eMMC standard does not define the maximum timeout value.
Different eMMC will have different timeout values.

>
> > Assume that the maximum timeout time supported by the hardware
> > timer is
> > 100 milliseconds
>
> I realise it is an example, but 100 milliseconds is a bit low. Legacy
> host controllers have always had to deal with standard SD card and
> MMC card timeouts. SD card write timeout of 500 milliseconds for
> instance.


Yes, this is just an example. I have several platforms, they are TI,
NXP, Intel and Qcom. The timeout time of the hardware timer is
different, the greatest one is 1.3s, some are less than 500ms.

>
> > but the maximum data transmission time required by
> > the device is 150 milliseconds. In most cases, data transfer will
> > not
> > take so long. 150 is the maximum time under extreme conditions.
> > This
> > means that the device just needs to complete a data transfer within
> > > 100ms and keep the data line busy. If we still use the HW timer,
> > > it
> > will trigger a DATA LINE timeout interrupt.
> >
> > This patch does not affect scenarios where the hardware timer meets
> > the
> > max data transmission time requirements of the device. It will
> > still
> > use the hardware timer. Only when the device changes, will it
> > switch to
> > using the SW timer.
>
> Which is what the quirk does. So I am very confused why the quirk is
> no goo

Because I don't know what the maximum volume of the real hardware timer
of each host controller is. And different hosts have different timer
capacities. Meanwhile, the eMMC devices have different data
transmission timeout between the different density/series as well.


Would you please confirm your three points above? If they are true, I
agree we cannot disable hardware timers and use SW tiner on some
platforms.

Thanks in advance.

Bean