RE: [PATCH] mmc: core: add a power cycle when CMD11 fails

From: DooHyun Hwang
Date: Wed Mar 03 2021 - 07:17:43 EST


On Tue, 2 Mar 2021 at 10:38, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>On Wed, 10 Feb 2021 at 06:12, DooHyun Hwang <dh0421.hwang@xxxxxxxxxxx>
>wrote:
>>
>> A power cycle is required if CMD11 fails.
>> CMD11 failure should be handled as no response.
>>
>> If there is a timeout error that means no response to the CMD11, do
>> not send the CMD11 again and the power cycle is required.
>> Any other errors for CMD11 are the same because CMD11 failed.
>>
>> On some bad SD Card, CMD11 may fail but the card may have already
>> invoked the voltage switch sequence.
>> In this case, it is necessary to retry without voltage switching after
>> power cycle.
>>
>> Signed-off-by: DooHyun Hwang <dh0421.hwang@xxxxxxxxxxx>
>
>Applied for next, thanks!
>
>I took the liberty of updating the commit message a bit, to try to clarify
>things. Moreover, I have tagged this for stable kernels.
>
>BTW, did you try to force the error to -EAGAIN, to keep retrying for a
>couple of times? If so, did it end up with the same kind of errors?

Thank you for reviewing this.

Yes. I tested with 2 SD cards.
I think the power cycle is needed before retrying
because SD card doesn't respond SD_ROCR_S18A when retrying without power cycle.


#1. No change
// send ACMD41 + with SD_OCR_S18R (bit[24])
<7>[ 100.852818] [0: kworker/0:0: 5] mmc0: starting CMD41 arg 51040000 flags 000000e1
// resp ACMD41 + with SD_ROCR_S18A (bit[24])
<7>[ 100.853177] I[0: Binder:4670_3: 4705] mmc0: req done (CMD41): 0: c1ff8000 00000000 00000000 00000000
// send CMD11 and error occurs
<7>[ 100.853288] [0: kworker/0:0: 5] mmc0: starting CMD11 arg 00000000 flags 00000015
<7>[ 100.854009] I[0: Binder:4670_3: 4705] mmc0: req done (CMD11): -84: 00000000 00000000 00000000 00000000
// clear SD_OCR_S18R
<4>[ 100.854145] [0: kworker/0:0: 5] mmc0: Skipping voltage switch
<7>[ 100.854179] [0: kworker/0:0: 5] mmc0: clock 400000Hz busmode 2 powermode 2 cs 1 Vdd 18 width 1 timing 0
<7>[ 100.855651] [0: kworker/0:0: 5] mmc0: starting CMD0 arg 00000000 flags 000000c0
<7>[ 100.855868] I[0: Binder:4670_3: 4705] mmc0: req done (CMD0): 0: 00000000 00000000 00000000 00000000
<7>[ 100.857234] [0: kworker/0:0: 5] mmc0: clock 400000Hz busmode 2 powermode 2 cs 0 Vdd 18 width 1 timing 0
<7>[ 100.858638] [0: kworker/0:0: 5] mmc0: starting CMD8 arg 000001aa flags 000002f5
<7>[ 100.859100] I[0: Binder:4670_3: 4705] mmc0: req done (CMD8): -84: 00000000 00000000 00000000 00000000
<7>[ 100.859607] [0: kworker/0:0: 5] mmc0: starting CMD55 arg 00000000 flags 000000f5
<7>[ 100.860098] I[0: Binder:4670_3: 4705] mmc0: req done (CMD55): -84: 00000000 00000000 00000000 00000000
...
<3>[ 100.861846] [0: kworker/0:0: 5] mmc0: error -84 whilst initialising SD card


#2. I tried to force the error to -EAGAIN(without power cycle), has confirmed
that the normal operation but card responded without SD_ROCR_S18A from ACMD41
So, Voltage Switching was skipped and the SD card initialized to HS mode.
// send ACMD41 + with SD_OCR_S18R (bit[24])
<7>[ 117.525089] [0: kworker/0:2: 375] mmc0: starting CMD41 arg 51040000 flags 000000e1
// resp ACMD41 + with SD_ROCR_S18A (bit[24])
<7>[ 117.525438] I[0:id.app.reminder: 8908] mmc0: req done (CMD41): 0: c1ff8000 00000000 00000000 00000000
// send CMD11 and complete
<7>[ 117.525505] [0: kworker/0:2: 375] mmc0: starting CMD11 arg 00000000 flags 00000015
<7>[ 117.525866] I[0:id.app.reminder: 8908] mmc0: req done (CMD11): 0: 00000320 00000000 00000000 00000000
<7>[ 117.527296] [0: kworker/0:2: 375] mmc0: clock 0Hz busmode 2 powermode 2 cs 0 Vdd 18 width 1 timing 0
<7>[ 117.540116] [0: kworker/0:2: 375] mmc0: clock 400000Hz busmode 2 powermode 2 cs 0 Vdd 18 width 1 timing 0
// set -EAGAIN
<3>[ 117.541650] [0: kworker/0:2: 375] mmc_sd_get_cid: rocr=0xc1ff8000, retries=10. err=0 -> -11.
// retry without power cycle
<7>[ 117.541683] [0: kworker/0:2: 375] mmc0: clock 400000Hz busmode 2 powermode 2 cs 1 Vdd 18 width 1 timing 0
<7>[ 117.543106] [0: kworker/0:2: 375] mmc0: starting CMD0 arg 00000000 flags 000000c0
<7>[ 117.543323] I[0:id.app.reminder: 8908] mmc0: req done (CMD0): 0: 00000000 00000000 00000000 00000000
<7>[ 117.544679] [0: kworker/0:2: 375] mmc0: clock 400000Hz busmode 2 powermode 2 cs 0 Vdd 18 width 1 timing 0
<7>[ 117.546131] [0: kworker/0:2: 375] mmc0: starting CMD8 arg 000001aa flags 000002f5
<7>[ 117.546484] I[0:id.app.reminder: 8908] mmc0: req done (CMD8): 0: 000001aa 00000000 00000000 00000000
// send ACMD41 + with SD_OCR_S18R (bit[24])
<7>[ 117.559967] [0: kworker/0:2: 375] mmc0: starting CMD55 arg 00000000 flags 000000f5
<7>[ 117.560445] I[0: ksoftirqd/0: 10] mmc0: req done (CMD55): 0: 00000120 00000000 00000000 00000000
<7>[ 117.560505] [0: kworker/0:2: 375] mmc0: starting CMD41 arg 51040000 flags 000000e1
// resp ACMD41 + without SD_ROCR_S18A (bit[24])
<7>[ 117.560853] I[0:Runtime worker : 8942] mmc0: req done (CMD41): 0: c0ff8000 00000000 00000000 00000000
<7>[ 117.562093] [0: kworker/0:2: 375] mmc0: starting CMD2 arg 00000000 flags 00000007


#3. Forced set error of CMD11 and the SD card initialized to HS mode.
// send ACMD41 + with SD_OCR_S18R (bit[24])
<7>[ 115.040643] [0: kworker/0:2: 690] mmc0: starting CMD41 arg 51040000 flags 000000e1
// resp ACMD41 + with SD_ROCR_S18A (bit[24])
<7>[ 115.040990] I[0:Jit thread pool: 8904] mmc0: req done (CMD41): 0: c1ff8000 00000000 00000000 00000000
// send CMD11 and error occurs
<7>[ 115.041092] [0: kworker/0:2: 690] mmc0: starting CMD11 arg 00000000 flags 00000015
<7>[ 115.041789] I[0:Jit thread pool: 8904] mmc0: req done (CMD11): -84: 00000000 00000000 00000000 00000000
// retry with power cycle (tested with this patch)
<7>[ 115.041855] [0: kworker/0:2: 690] mmc0: Signal voltage switch failed, power cycling card
<7>[ 115.041898] [0: kworker/0:2: 690] mmc0: clock 0Hz busmode 2 powermode 0 cs 0 Vdd 0 width 1 timing 0
<7>[ 115.045922] [0: kworker/0:2: 690] mmc0: clock 0Hz busmode 2 powermode 1 cs 0 Vdd 18 width 1 timing 0
<7>[ 115.064130] [0: kworker/0:2: 690] mmc0: clock 400000Hz busmode 2 powermode 2 cs 0 Vdd 18 width 1 timing 0
// return error and retry after (retries = 0;)
<3>[ 115.076912] [0: kworker/0:2: 690] mmc_sd_get_cid: rocr=0xc1ff8000, retries=10. err=-84.
// clear SD_OCR_S18R
<4>[ 115.076943] [0: kworker/0:2: 690] mmc0: Skipping voltage switch
<7>[ 115.076973] [0: kworker/0:2: 690] mmc0: clock 400000Hz busmode 2 powermode 2 cs 1 Vdd 18 width 1 timing 0
<7>[ 115.078375] [0: kworker/0:2: 690] mmc0: starting CMD0 arg 00000000 flags 000000c0
<7>[ 115.078617] I[0: swapper/0: 0] mmc0: req done (CMD0): 0: 00000000 00000000 00000000 00000000
<7>[ 115.079980] [0: kworker/0:2: 690] mmc0: clock 400000Hz busmode 2 powermode 2 cs 0 Vdd 18 width 1 timing 0
<7>[ 115.081506] [0: kworker/0:2: 690] mmc0: starting CMD8 arg 000001aa flags 000002f5
<7>[ 115.081868] I[0: swapper/0: 0] mmc0: req done (CMD8): 0: 000001aa 00000000 00000000 00000000
// send ACMD41 + without SD_OCR_S18R (bit[24])
<7>[ 115.190656] [0: kworker/0:2: 690] mmc0: starting CMD55 arg 00000000 flags 000000f5
<7>[ 115.191046] I[0:Jit thread pool: 4475] mmc0: req done (CMD55): 0: 00000120 00000000 00000000 00000000
<7>[ 115.191113] [0: kworker/0:2: 690] mmc0: starting CMD41 arg 50040000 flags 000000e1
// resp ACMD41 + without SD_ROCR_S18A (bit[24])
<7>[ 115.191474] I[0:Jit thread pool: 4475] mmc0: req done (CMD41): 0: c0ff8000 00000000 00000000 00000000
<7>[ 115.191541] [0: kworker/0:2: 690] mmc0: starting CMD2 arg 00000000 flags 00000007


#4. SD card responded with SD_ROCR_S18A from ACMD41 after power cycle, and change CMD11's error value to -EAGAIN
// send ACMD41 + with SD_OCR_S18R (bit[24])
<7>[ 156.884623] [0: kworker/0:1: 7] mmc0: starting CMD41 arg 51040000 flags 000000e1
// resp ACMD41 + with SD_ROCR_S18A (bit[24])
<7>[ 156.884975] I[0: highpool[3]: 5440] mmc0: req done (CMD41): 0: c1ff8000 00000000 00000000 00000000
// send CMD11 and error occurs
<7>[ 156.885051] [0: kworker/0:1: 7] mmc0: starting CMD11 arg 00000000 flags 00000015
<7>[ 156.885759] I[0: highpool[3]: 5440] mmc0: req done (CMD11): -84: 00000000 00000000 00000000 00000000
// retry with power cycle (tested with this patch)
<7>[ 156.885834] [0: kworker/0:1: 7] mmc0: Signal voltage switch failed, power cycling card
<7>[ 156.885875] [0: kworker/0:1: 7] mmc0: clock 0Hz busmode 2 powermode 0 cs 0 Vdd 0 width 1 timing 0
<7>[ 156.920185] [0: kworker/0:1: 7] mmc0: clock 400000Hz busmode 2 powermode 2 cs 0 Vdd 18 width 1 timing 0
// change CMD11's error value to -EAGAIN
<3>[ 156.932288] [0: kworker/0:1: 7] mmc_sd_get_cid: rocr=0xc1ff8000, retries=10. err=-84 -> -11.
<7>[ 156.932336] [0: kworker/0:1: 7] mmc0: clock 400000Hz busmode 2 powermode 2 cs 1 Vdd 18 width 1 timing 0
<7>[ 156.933834] [0: kworker/0:1: 7] mmc0: starting CMD0 arg 00000000 flags 000000c0
<7>[ 156.934101] I[0:ung.android.mdx:10229] mmc0: req done (CMD0): 0: 00000000 00000000 00000000 00000000
<7>[ 156.935562] [0: kworker/0:1: 7] mmc0: clock 400000Hz busmode 2 powermode 2 cs 0 Vdd 18 width 1 timing 0
<7>[ 156.937009] [0: kworker/0:1: 7] mmc0: starting CMD8 arg 000001aa flags 000002f5
<7>[ 156.937372] I[0:ung.android.mdx:10229] mmc0: req done (CMD8): 0: 000001aa 00000000 00000000 00000000
// send ACMD41 + with SD_OCR_S18R (bit[24])
<7>[ 157.044190] [0: kworker/0:1: 7] mmc0: starting CMD55 arg 00000000 flags 000000f5
<7>[ 157.044924] I[0:ung.android.mdx:10229] mmc0: req done (CMD55): 0: 00000120 00000000 00000000 00000000
<7>[ 157.045023] [0: kworker/0:1: 7] mmc0: starting CMD41 arg 51040000 flags 000000e1
// resp ACMD41 + with SD_ROCR_S18A (bit[24])
<7>[ 157.045389] I[0:ung.android.mdx:10229] mmc0: req done (CMD41): 0: c1ff8000 00000000 00000000 00000000
// send CMD11 and complete
<7>[ 157.045467] [0: kworker/0:1: 7] mmc0: starting CMD11 arg 00000000 flags 00000015
<7>[ 157.045855] I[0: Binder:798_2: 805] mmc0: req done (CMD11): 0: 00000320 00000000 00000000 00000000


>
>> ---
>> drivers/mmc/core/core.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c index
>> 1136b859ddd8..a6674df2a7bb 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -1207,7 +1207,7 @@ int mmc_set_uhs_voltage(struct mmc_host *host,
>> u32 ocr)
>>
>> err = mmc_wait_for_cmd(host, &cmd, 0);
>> if (err)
>> - return err;
>> + goto power_cycle;
>>
>> if (!mmc_host_is_spi(host) && (cmd.resp[0] & R1_ERROR))
>> return -EIO;
>> --
>> 2.29.0
>>
>
>Kind regards
>Uffe

Thanks and regards.
DooHyun Hwang.