Re: [PATCH v3] cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer

From: Vignesh Raghavendra
Date: Fri Mar 01 2019 - 11:48:44 EST


[...]
>>>>> In function do_write_buffer(), in the for loop, there is a case
>>>>> chip_ready() returns 1 while chip_good() returns 0, so it never break
>>>>> the loop.
>>>>> To fix this, chip_good() is enough and it should timeout if it stay
>>>>> bad for a while.
>>>>>
>>>>> Fixes: dfeae1073583("mtd: cfi_cmdset_0002: Change write buffer to
>>>>> check correct value")
>>>>> Signed-off-by: Yi Huaijie <yihuaijie@xxxxxxxxxx>
>>>>> Signed-off-by: Liu Jian <liujian56@xxxxxxxxxx>
>>>>> Reviewed-by: Tokunori Ikegami <ikegami_to@xxxxxxxxxxx>
>>>>> ---
>>>>> v2->v3:
>>>>> Follow Vignesh's advice:
>>>>> add one more check for check_good() even when time_after() returns
>> true.
>>>>>
>>>>> drivers/mtd/chips/cfi_cmdset_0002.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c
>>>>> b/drivers/mtd/chips/cfi_cmdset_0002.c
>>>>> index 72428b6..3da2376 100644
>>>>> --- a/drivers/mtd/chips/cfi_cmdset_0002.c
>>>>> +++ b/drivers/mtd/chips/cfi_cmdset_0002.c
>>>>> @@ -1876,7 +1876,7 @@ static int __xipram do_write_buffer(struct
>>>>> map_info *map, struct flchip *chip,
>>>>> continue;
>>>>> }
>>>>>
>>>>> - if (time_after(jiffies, timeo) && !chip_ready(map, adr))
>>>>> + if (time_after(jiffies, timeo) && !chip_good(map, adr,
>>>>> datum))
>>>>
>>>> Just another idea to understand easily.
>>>>
>>>> unsigned long now = jiffies;
>>>>
>>>> if (chip_good(map, adr, datum)) {
>>>> xip_enable(map, chip, adr);
>>>> goto op_done;
>>>> }
>>>>
>>>> if (time_after(now, timeo) {
>>>> break;
>>>> }
>>>>
>>>
>>> Thank you~. It is more easier to understandï
>>> If there are no other comments, I will send new patch again ):
>>
>> Except this version no longer does what Vignesh suggested. See how you
>> no longer test if chip_good() is true if time_after() returns true. So,
>> imagine the thread entering this function is preempted just after the
>> first chip_good() test, and resumed a few ms later. time_after() will
>> return true, but chip_good() might also return true, and you ignore it.
>
> I think that the following 3 versions will be worked for time_after() as a same result and follow the Vignesh-san suggestion.
>

As Boris explained above version 3 does not really follow my
suggestion... Please see below

> 1. Original Vignesh-san suggestion
>
> if (chip_good(map, adr, datum)) {
> xip_enable(map, chip, adr);
> goto op_done;
> }
>
> if (time_after(jiffies, timeo)) {
> /* Test chip_good() if TRUE incorrectly again so write failure by time_after() can be avoided. */
> if (chip_good(map, adr, datum)) {
> xip_enable(map, chip, adr);
> goto op_done;
> }
> break;
> }
>


Right, here we check chip_good() once _even when time_after() is true_
to avoid _spurious_ timeout

> 2. Liujian-san v3 patch
>
> /* Test chip_good() if FALSE correctly so write failure by time_after() can be avoided. */
> if (time_after(jiffies, timeo) && !chip_good(map, adr))
> break;
>
> if (chip_good(map, adr, datum)) {
> xip_enable(map, chip, adr);
> goto op_done;
> }
>

This is a better version of 1

> 3. My idea
>
> /* Save current jiffies value before chip_good() to avoid write failure by time_after() without testing chip_good() again. */
> unsigned long now = jiffies;
>
> if (chip_good(map, adr, datum)) {
> xip_enable(map, chip, adr);
> goto op_done;
> }
>

What if thread gets pre-empted at this point and is re-scheduled exactly
after timeo jiffies have elapsed? Below check would be true and exit loop

> if (time_after(now, timeo))
> break;
>

So, code does not check for check chip_good() after timeoout has
elapsed. But chip_good() might be true at this point. Therefore leading
to spurious timeout. So this version is still not right.

> Note: Some brackets have been fixed from the previous mail.
>