Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND Flash Controller

From: Boris Brezillon
Date: Mon Nov 19 2018 - 03:02:53 EST


On Mon, 19 Nov 2018 06:20:28 +0000
Naga Sureshkumar Relli <nagasure@xxxxxxxxxx> wrote:

> Hi Boris,
>
> > -----Original Message-----
> > From: Boris Brezillon [mailto:boris.brezillon@xxxxxxxxxxx]
> > Sent: Monday, November 19, 2018 1:13 AM
> > To: Naga Sureshkumar Relli <nagasure@xxxxxxxxxx>
> > Cc: miquel.raynal@xxxxxxxxxxx; richard@xxxxxx; dwmw2@xxxxxxxxxxxxx;
> > computersforpeace@xxxxxxxxx; marek.vasut@xxxxxxxxx; linux-mtd@xxxxxxxxxxxxxxxxxxx; linux-
> > kernel@xxxxxxxxxxxxxxx; nagasuresh12@xxxxxxxxx; robh@xxxxxxxxxx; Michal Simek
> > <michals@xxxxxxxxxx>
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND
> > Flash Controller
> >
> > On Thu, 15 Nov 2018 09:34:16 +0000
> > Naga Sureshkumar Relli <nagasure@xxxxxxxxxx> wrote:
> >
> > > Hi Boris & Miquel,
> > >
> > > I am updating the driver to address your comments, and I have one
> > > concern, especially in anfc_read_page_hwecc(), where I am checking for
> > > bitflips in erased pages. The Arasan NAND controller has no multi-bit
> > > error detection beyond 24 bits (it can correct up to 24 bits), i.e. the
> > > controller gives no indication of an uncorrectable error beyond 24 bits.
> >
> > Do you mean that you can't detect uncorrectable errors, or just that it's not 100% sure to detect
> > errors above max_strength?
> Yes, in Arasan NAND controller there is no way to detect uncorrectable errors beyond 24-bit.

So how do you detect uncorrectable errors when the strength is less than
24 bits?

> >
> > > So I took an error count as the default value (MULTI_BIT_ERR_CNT 16; I
> > > chose this based on the error count I got while reading an erased page
> > > on a Micron device). During a page read, I just read the error count
> > > register and compare it with the default error count (16); if it is
> > > higher than the default, I check for erased-page bitflips.
> >
> > Hm, that's wrong, especially if you set ecc_strength to something > 16.
> Ok
> >
> > > I doubt that this will work in all cases.
> >
> > It definitely doesn't.
> Ok
> >
> > > In my case it only works because the error count reported on an erased page is 16.
> > > Could you please suggest a way to detect erased-page bitflips when reading a page
> > > with HW-ECC?
> >
> > I'm a bit lost. Is the problem only about bitflips in erased pages, or does it also impact reads of
> > written pages that lead to uncorrectable errors?
> Yes, it is for both. But for read errors beyond 24 bits, which we can't detect, the answer from the HW design team
> is that the flash part is bad.
> Unfortunately, so far we haven't run into that situation (read errors of written pages beyond 24 bits).

Can you please run nandbiterrs (available in mtd-utils)? I fear your
device won't pass the test.

> But we are hitting this because of erased page reading(needed in case of ubifs).
>
> >
> > Don't you have a bit (or several bits) reporting when the ECC engine was not able to correct
> > data? If you do, you should base the "detect bitflips in erased pages" logic on this information.
> Reporting of multi-bit errors exists only for Hamming (1-bit correction and 2-bit detection), not for BCH.
>

Then I tend to agree with Miquel: your ECC engine is broken, and I'm
not even sure how to deal with that yet.
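
For reference, the "detect bitflips in erased pages" logic mentioned above
is normally run only after the ECC engine has flagged a chunk as
uncorrectable, which is exactly the signal this controller lacks for BCH.
Below is a minimal userspace sketch of that check, assuming the threshold is
the configured ECC strength. The helper names count_zero_bits() and
check_erased_chunk() are hypothetical; the sketch only mimics the idea behind
the kernel's nand_check_erased_ecc_chunk() and is not the driver's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the bits that read back as 0 (an erased NAND chunk is all 0xff). */
static int count_zero_bits(const uint8_t *buf, size_t len)
{
	int zeros = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t flipped = (uint8_t)~buf[i];	/* set bits mark bitflips */

		while (flipped) {
			flipped &= (uint8_t)(flipped - 1);	/* clear lowest set bit */
			zeros++;
		}
	}
	return zeros;
}

/*
 * Hypothetical helper: if the chunk (data + ECC bytes) has at most
 * 'threshold' zero bits, treat it as erased, reset both buffers to 0xff
 * and return the number of bitflips found. Otherwise return -1, meaning
 * the chunk is not erased and the error is genuinely uncorrectable.
 */
static int check_erased_chunk(uint8_t *data, size_t data_len,
			      uint8_t *ecc, size_t ecc_len, int threshold)
{
	int flips = count_zero_bits(data, data_len) +
		    count_zero_bits(ecc, ecc_len);

	if (flips > threshold)
		return -1;

	memset(data, 0xff, data_len);
	memset(ecc, 0xff, ecc_len);
	return flips;
}
```

The threshold here follows the ECC strength rather than a fixed constant such
as 16, so the check stays meaningful when the strength is configured above or
below that value.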