Re: [PATCH 5/5] lightnvm: pblk: fix erase counters on error fail
From: Javier GonzÃlez
Date: Sun Apr 23 2017 - 14:20:27 EST
> On 23 Apr 2017, at 19.59, Matias BjÃrling <mb@xxxxxxxxxxx> wrote:
>
> On 04/22/2017 11:31 AM, Javier GonzÃlez wrote:
>>> On 22 Apr 2017, at 11.22, Matias BjÃrling <mb@xxxxxxxxxxx> wrote:
>>>
>>> On 04/22/2017 01:32 AM, Javier GonzÃlez wrote:
>>>> When block erases fail, these blocks are marked bad. The number of valid
>>>> blocks in the line was not updated, which could cause an infinite loop
>>>> on the erase path.
>>>>
>>>> Fix this atomic counter and, in order to avoid taking an irq lock on the
>>>> interrupt context, make the erase counters atomic too.
>>>
>>> I can't find out where the counters are used in irq context? Can you
>>> point me in the right direction? I'll prefer for these counters to go
>>> in under the existing line_lock.
>>
>> This counter is line->blk_in_line, which is used on pblk_mark_bb. This
>> is triggered on the erase completion path. Note that we do not want to
>> wait until the recovery kicks in on the workqueue since the line might
>> start getting recycled straight away if we are close to reaching
>> capacity. This is indeed the scenario that triggers the race condition.
>>
>> Making all erase counters atomic allows us to avoid taking the
>> line_lock. Note that the counters do not depend on each other.
>>
>>>> Also, in the case that a significant number of blocks become bad in a
>>>> line, the result is the double shared metadata buffer (emeta) to stop
>>>> the pipeline until all metadata is flushed to the media. Increase the
>>>> number of metadata lines from 2 to 4 to avoid this case.
>>>
>>> How does moving to 4 lines solve this case? The way I read it is that
>>> it only postpones when this occurs?
>>
>> The chances of having more than 2 lines falling out of blocks after
>> pre-condition are slim. Adding two more lines should be enough.
>>
>>>> [...]
>>>>
>>>> -#define PBLK_DATA_LINES 2
>>>> +#define PBLK_DATA_LINES 4
>>>
>>> Why this change? I like to keep new features for 4.13. Only bugfixes for 4.12.
>>
>> This is the 4 lines referred above. I see it as a bug fix since we risk
>> stalling the pipeline if a line goes above a certain number of bad
>> blocks on initialization, but we can leave it out and fix this when we
>> add in-line metadata on the write thread for 4.12
>>
>> Thanks,
>> Javier
>
> Okay. It tickles me a bit. However, to make it pretty, some
> refactoring is needed, which we won't push for this release.
>
> Reviewed-by: Matias BjÃrling <matias@xxxxxxxxxxxx>
Yes, you are right. I think it will become much better as we implement
the FTL log and refactor the metadata path.
Thanks!
Javier
Attachment:
signature.asc
Description: Message signed with OpenPGP