[PATCH] mmc: sdhci-msm: Add retries when all tuning phases are found valid

From: Douglas Anderson
Date: Thu Aug 27 2020 - 11:00:00 EST


As the comments in this patch say, if we tune and find all phases are
valid it's _almost_ as bad as no phases being found valid. Probably
all phases are not really reliable but we didn't detect where the
unreliable place is. That means we'll essentially be guessing and
hoping we get a good phase.

This is not just a problem in theory. It was causing real problems on
a real board. On that board, most often phase 10 is found as the only
invalid phase, though sometimes 10 and 11 are invalid and sometimes
just 11. Some percentage of the time, however, all phases are found
to be valid. When this happens, the current logic will decide to use
phase 11. Since phase 11 is sometimes found to be invalid, this is a
bad choice. Sure enough, when phase 11 is picked we often get mmc
errors later in boot.

I have seen cases where all phases were found to be valid 3 times in a
row, so increase the retry count to 10 just to be extra sure.

Fixes: 415b5a75da43 ("mmc: sdhci-msm: Add platform_execute_tuning implementation")
Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
---

drivers/mmc/host/sdhci-msm.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index b7e47107a31a..1b78106681e0 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -1165,7 +1165,7 @@ static void sdhci_msm_set_cdr(struct sdhci_host *host, bool enable)
static int sdhci_msm_execute_tuning(struct mmc_host *mmc, u32 opcode)
{
struct sdhci_host *host = mmc_priv(mmc);
- int tuning_seq_cnt = 3;
+ int tuning_seq_cnt = 10;
u8 phase, tuned_phases[16], tuned_phase_cnt = 0;
int rc;
struct mmc_ios ios = host->mmc->ios;
@@ -1221,6 +1221,22 @@ static int sdhci_msm_execute_tuning(struct mmc_host *mmc, u32 opcode)
} while (++phase < ARRAY_SIZE(tuned_phases));

if (tuned_phase_cnt) {
+ if (tuned_phase_cnt == ARRAY_SIZE(tuned_phases)) {
+ /*
+ * All phases valid is _almost_ as bad as no phases
+ * valid. Probably all phases are not really reliable
+ * but we didn't detect where the unreliable place is.
+ * That means we'll essentially be guessing and hoping
+ * we get a good phase. Better to try a few times.
+ */
+ dev_dbg(mmc_dev(mmc), "%s: All phases valid; try again\n",
+ mmc_hostname(mmc));
+ if (--tuning_seq_cnt) {
+ tuned_phase_cnt = 0;
+ goto retry;
+ }
+ }
+
rc = msm_find_most_appropriate_phase(host, tuned_phases,
tuned_phase_cnt);
if (rc < 0)
--
2.28.0.297.g1956fa8f8d-goog