Re: [PATCH 1/1] mmc: meson-mx-sdio: Set MMC_CAP_WAIT_WHILE_BUSY

From: Ulf Hansson
Date: Wed Apr 15 2020 - 08:59:43 EST


On Fri, 10 Apr 2020 at 23:30, Martin Blumenstingl
<martin.blumenstingl@xxxxxxxxxxxxxx> wrote:
>
> The Meson SDIO controller uses the DAT0 lane for hardware busy
> detection. Set MMC_CAP_WAIT_WHILE_BUSY accordingly. This fixes
> the following error observed with Linux 5.7 (pre-rc-1):
> mmc1: Card stuck being busy! __mmc_poll_for_busy
> blk_update_request: I/O error, dev mmcblk1, sector 17111080 op
> 0x3:(DISCARD) flags 0x0 phys_seg 1 prio class 0
>
> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@xxxxxxxxxxxxxx>
> ---
> I'm sending this as RFC because I'm not sure if this is a proper fix.
> It "fixes" the issue for me but I want the MMC maintainers to double-
> check this.

Thanks for sending this! I assume it's a regression and caused by one
of my patches that went in for 5.7. Probably this one:
0d84c3e6a5b2 mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard

Now, even if enabling MMC_CAP_WAIT_WHILE_BUSY seems like the correct
thing to do, I suggest we really try understand why it works, so we
don't overlook some other issue that needs to be fixed.

Would you be willing to try a few debug patches, according to the below?

First, can you double check so the original polling with CMD13 is
still okay, by trying the below minor change. The intent is to force
polling with CMD13 for the erase/discard operation.

diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
index baa6314f69b4..bbf1dff0ae2a 100644
--- a/drivers/mmc/core/mmc_ops.c
+++ b/drivers/mmc/core/mmc_ops.c
@@ -452,7 +452,7 @@ static int mmc_busy_status(struct mmc_card *card,
bool retry_crc_err,
u32 status = 0;
int err;

- if (host->ops->card_busy) {
+ if (host->ops->card_busy && busy_cmd != MMC_BUSY_ERASE) {
*busy = host->ops->card_busy(host);
return 0;
}
--

Second, if the above works, it looks like the polling with
->card_busy() isn't really working for meson-mx-sdio.c, together with
erase/discard. To narrow down that problem, I suggest to try with a
longer erase/discard timeout in a retry fashion, while using
->card_busy(). Along the lines of the below:

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 4c5de6d37ac7..240e52fcdf2d 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -1746,6 +1746,11 @@ static int mmc_do_erase(struct mmc_card *card,
unsigned int from,

/* Let's poll to find out when the erase operation completes. */
err = mmc_poll_for_busy(card, busy_timeout, MMC_BUSY_ERASE);
+ if (err) {
+ pr_err("%s: Erase poll failed err=%d timeout_ms=%u - retry!\n",
+ mmc_hostname(host), err, busy_timeout);
+ err = mmc_poll_for_busy(card, 30000, MMC_BUSY_ERASE);
+ }

out:
mmc_retune_release(card->host);
--

Let's see what this gives us...

Kind regards
Uffe

>
>
> drivers/mmc/host/meson-mx-sdio.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/meson-mx-sdio.c b/drivers/mmc/host/meson-mx-sdio.c
> index 8b038e7b2cd3..fe02130237a8 100644
> --- a/drivers/mmc/host/meson-mx-sdio.c
> +++ b/drivers/mmc/host/meson-mx-sdio.c
> @@ -570,7 +570,7 @@ static int meson_mx_mmc_add_host(struct meson_mx_mmc_host *host)
> mmc->f_max = clk_round_rate(host->cfg_div_clk,
> clk_get_rate(host->parent_clk));
>
> - mmc->caps |= MMC_CAP_ERASE | MMC_CAP_CMD23;
> + mmc->caps |= MMC_CAP_ERASE | MMC_CAP_CMD23 | MMC_CAP_WAIT_WHILE_BUSY;
> mmc->ops = &meson_mx_mmc_ops;
>
> ret = mmc_of_parse(mmc);
> --
> 2.26.0
>