Re: [RFC PATCH v2 0/1] lightnvm: move bad block and chunk state logic to core
From: Javier Gonzalez
Date: Fri Aug 17 2018 - 06:49:38 EST
> On 17 Aug 2018, at 11.42, Matias BjÃrling <mb@xxxxxxxxxxx> wrote:
>
> On 08/17/2018 11:34 AM, Javier Gonzalez wrote:
>>> On 17 Aug 2018, at 11.29, Matias BjÃrling <mb@xxxxxxxxxxx> wrote:
>>>
>>> On 08/17/2018 10:44 AM, Javier Gonzalez wrote:
>>>>> On 17 Aug 2018, at 10.21, Matias BjÃrling <mb@xxxxxxxxxxx> wrote:
>>>>>
>>>>> On 08/16/2018 05:53 PM, Javier Gonzalez wrote:
>>>>>>> On 16 Aug 2018, at 13.34, Matias BjÃrling <mb@xxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>> This patch moves the 1.2 and 2.0 block/chunk metadata retrieval to
>>>>>>> core.
>>>>>>>
>>>>>>> Hi Javier, I did not end up using your patch. I had misunderstood what
>>>>>>> was implemented. Instead I implemented the detection of the each chunk by
>>>>>>> first sensing the first page, then the last page, and if the chunk
>>>>>>> is sensed as open, a per page scan will be executed to update the write
>>>>>>> pointer appropriately.
>>>>>> I see why you want to do it this way for maintaining the chunk
>>>>>> abstraction, but this is potentially very inefficient as blocks not used
>>>>>> by any target will be recovered unnecessarily.
>>>>>
>>>>> True. It will up to the target to not ask for more metadata than necessary (similarly for 2.0)
>>>>>
>>>>> Note that in 1.2, it is
>>>>>> expected that targets will need to recover the write pointer themselves.
>>>>>> What is more, in the normal path, this will be part of the metadata
>>>>>> being stored so no wp recovery is needed. Still, this approach forces
>>>>>> recovery on each 1.2 instance creation (also on factory reset). In this
>>>>>> context, you are right, the patch I proposed only addresses the double
>>>>>> erase issue, which was the original motivator, and left the actual
>>>>>> pointer recovery to the normal pblk recovery process.
>>>>>> Besides this, in order to consider this as a real possibility, we need
>>>>>> to measure the impact on startup time. For this, could you implement
>>>>>> nvm_bb_scan_chunk() and nvm_bb_chunk_sense() more efficiently by
>>>>>> recovering (i) asynchronously and (ii) concurrently across luns so that
>>>>>> we can establish the recovery cost more fairly? We can look at a
>>>>>> specific penalty ranges afterwards.
>>>>>
>>>>> Honestly, 1.2 is deprecated.
>>>> For some...
>>> No. OCSSD 1.2 is deprecated. Others that have a derivative of 1.2 have
>>> their own storage stack and spec that they will continue development
>>> on, which can not be expected to be compatible with the OCSSD 1.2 that
>>> is implemented in the lightnvm subsystem.
>> There are 1.2 devices out there using the current stack with no changes. >
>
> Yes, obviously, and they should continue to work. Which this patch doesn't change.
>
>>>>> I don't care about the performance, I
>>>>> care about being easy to maintain, so it doesn't borg me down in the
>>>>> future.
>>>> This should be stated clear in the commit message.
>>>>> Back of the envelope calculation for a 64 die SSD with 1024 blocks per
>>>>> die, and 60us read time, will take 4 seconds to scan if all chunks are
>>>>> free, a worst case something like ~10 seconds. -> Not a problem for
>>>>> me.
>>>> Worst case is _much_ worse than 10s if you need to scan the block to
>>>> find the write pointer. We are talking minutes.
>>>
>>> I think you may be assuming that all blocks are open. My assumption is
>>> that this is very rare (given the NAND characteristics). At most a
>>> couple of blocks may be open per die. That leads me to the time
>>> quoted.
>> Worst case is worst case, no assumptions.
>>>> At least make the recovery reads asynchronous. It is low hanging fruit
>>>> and will help the average case significantly.
>>>>>> Also, the recovery scheme in pblk will change significantly by doing
>>>>>> this, so I assume you will send a followup patchset reimplementing
>>>>>> recovery for the 1.2 path?
>>>>>
>>>>> The 1.2 path shouldn't be necessary after this. That is the idea of
>>>>> this work. Obviously, the set bad block interface will have to
>>>>> preserved and called.
>>>> If we base this patch on top of my 2.0 recovery, we will still need to
>>>> make changes to support all 1.2 corner cases.
>>>> How do you want to do it? We get this patch in shape and I rebase on top
>>>> or the other way around?
>>>
>>> I'll pull this in when you're tested it with your 1.2 implementation.
>> Please, address the asynchronous read comment before considering pulling
>> this path. There is really no reason not to improve this.
>
> I'll accept patches, but I won't spend time on it. Please let me know if you have other comments.
Your choice to ignore my comment on a RFC at this early stage of the
4.20 window.
In any case, I tested on our 1.2 devices and the patch has passed
functionality.
Please do not add reviewed-by or tested-by tags with my name as I do not
back the performance implications of the current implementation (on an
otherwise good cleanup patch).
Javier
Attachment:
signature.asc
Description: Message signed with OpenPGP