Re: [PATCH v3 4/4] mtd: rawnand: micron: support 8/512 on-die ECC
From: Chris Packham
Date: Wed Jun 20 2018 - 18:22:15 EST
On 20/06/18 20:02, Boris Brezillon wrote:
> On Wed, 20 Jun 2018 17:05:44 +1200
> Chris Packham <chris.packham@xxxxxxxxxxxxxxxxxxx> wrote:
>
>> Micron MT29F1G08ABAFAWP-ITE:F supports an on-die ECC with 8 bits
>> per 512 bytes. Add support for this combination.
>>
>> Signed-off-by: Chris Packham <chris.packham@xxxxxxxxxxxxxxxxxxx>
>> ---
>> Changes in v2:
>> - New
>> Changes in v3:
>> - Handle reporting of corrected errors that don't require a rewrite, expand
>> comment for the ECC status bits.
>>
>> drivers/mtd/nand/raw/nand_micron.c | 34 ++++++++++++++++++++++++------
>> 1 file changed, 27 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
>> index 5cec79372181..0c2bde4411d7 100644
>> --- a/drivers/mtd/nand/raw/nand_micron.c
>> +++ b/drivers/mtd/nand/raw/nand_micron.c
>> @@ -18,10 +18,24 @@
>> #include <linux/mtd/rawnand.h>
>>
>> /*
>> - * Special Micron status bit that indicates when the block has been
>> - * corrected by on-die ECC and should be rewritten
>> + * Special Micron status bit 3 indicates that the block has been
>> + * corrected by on-die ECC and should be rewritten.
>> + *
>> + * On chips with 8-bit ECC an additional bit can be used to distinguish
>> + * cases where errors were corrected without needing a rewrite.
>> + *
>> + * Bit 4 Bit 3 Bit 0 Description
>> + * ----- ----- ----- -----------
>> + * 0     0     0     No Errors
>> + * 0     0     1     Multiple uncorrected errors
>> + * 0     1     0     4 - 6 errors corrected, recommend rewrite
>> + * 0     1     1     Reserved
>> + * 1     0     0     1 - 3 errors corrected
>> + * 1     0     1     Reserved
>> + * 1     1     0     7 - 8 errors corrected, recommend rewrite
>> */
>> #define NAND_STATUS_WRITE_RECOMMENDED BIT(3)
>> +#define NAND_STATUS_ERRORS_CORRECTED BIT(4)
>>
>> struct nand_onfi_vendor_micron {
>> u8 two_plane_read;
>> @@ -141,7 +155,7 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
>> mtd->ecc_stats.failed++;
>>
>> /*
>> - * The internal ECC doesn't tell us the number of bitflips
>> + * The internal 4-bit ECC doesn't tell us the number of bitflips
>> * that have been corrected, but tells us if it recommends to
>> * rewrite the block. If it's the case, then we pretend we had
>> * a number of bitflips equal to the ECC strength, which will
>> @@ -149,6 +163,12 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
>> */
>> else if (status & NAND_STATUS_WRITE_RECOMMENDED)
>> max_bitflips = chip->ecc.strength;
>> + /*
>> + * Chips with 8-bit internal ECC do tell us if 1 to 3 bit
>> + * errors have been corrected without recommending a rewrite.
>> + */
>> + else if (status & NAND_STATUS_ERRORS_CORRECTED)
>> + max_bitflips = 3;
>
> Why not masking bit 3, 4 and 0 and having a switch-case block?
Mainly because the existing code was just checking bit 3 and that
happened to work for my use-case.
I'm happy to re-work it as you've suggested. Can anyone point me at the
datasheet for a 4/512 on-die chip (or just the part number, so I can look
it up on Micron's site)?
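For reference, a rough and untested sketch of how I read the suggestion,
based only on the 8/512 status table in the comment above. The helper name
and the STATUS_* macros are made up for illustration and are not existing
kernel symbols. Treating the reserved combinations as failures, and
reporting the upper bound of each range (3, 6, 8) as the bitflip count, are
assumptions on my part:

```c
#include <stdint.h>

/* Status register bits used by the 8/512 on-die ECC (see table above). */
#define STATUS_FAIL              (1u << 0)
#define STATUS_WRITE_RECOMMENDED (1u << 3)
#define STATUS_ERRORS_CORRECTED  (1u << 4)
#define STATUS_ECC_MASK \
	(STATUS_FAIL | STATUS_WRITE_RECOMMENDED | STATUS_ERRORS_CORRECTED)

/*
 * Hypothetical helper: map the masked status bits to a worst-case
 * bitflip count. Returns -1 for uncorrectable errors and for the
 * reserved combinations (the latter is an assumption).
 */
static int ecc8_status_to_bitflips(uint8_t status)
{
	switch (status & STATUS_ECC_MASK) {
	case 0:
		return 0;	/* no errors */
	case STATUS_ERRORS_CORRECTED:
		return 3;	/* 1 - 3 errors corrected */
	case STATUS_WRITE_RECOMMENDED:
		return 6;	/* 4 - 6 errors corrected, rewrite */
	case STATUS_ERRORS_CORRECTED | STATUS_WRITE_RECOMMENDED:
		return 8;	/* 7 - 8 errors corrected, rewrite */
	case STATUS_FAIL:
	default:
		return -1;	/* uncorrectable or reserved */
	}
}
```

A caller would bump ecc_stats.failed on a negative return and otherwise use
the value as max_bitflips. The 4/512 chips would need their own mapping
once I have seen a datasheet for one.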
>
> Also, you should update ecc_stats.corrected (see the patch I just sent
> [1]).
>
Will do. I'll pull in your patch and base v4 on top of that.
>>
>> ret = nand_read_data_op(chip, buf, mtd->writesize, false);
>> if (!ret && oob_required)
>> @@ -240,9 +260,9 @@ static int micron_supports_on_die_ecc(struct nand_chip *chip)
>>
>> /*
>> * Some Micron NANDs have an on-die ECC of 4/512, some other
>> - * 8/512. We only support the former.
>> + * 8/512.
>> */
>> - if (chip->ecc_strength_ds != 4)
>> + if (chip->ecc_strength_ds != 4 && chip->ecc_strength_ds != 8)
>> return MICRON_ON_DIE_UNSUPPORTED;
I was thinking about removing this. The original code excluded 8/512 due
to lack of access to a chip that implements this, which I now have.
> Given that our on-die-support detection procedure is not reliable, I'd
> recommend changing the way we do it and instead base this detection
> logic on the model name (in the ONFI param page) or the READ_ID bytes.
>
The problem is I don't have an exhaustive list of IDs that this applies
to. I guess having a list of known IDs and falling back to the current
detection is probably the best approach, unless Micron get back to us
with some other method of detecting these and determining whether on-die
ECC is forcibly enabled.
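Roughly what I have in mind, as an untested sketch only. The enum, struct
and function names are hypothetical, the single table entry is just the
part number from the commit message (I have not checked what string the
ONFI param page actually reports for it), and the result of the existing
probe is passed in as the fallback:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-ins for the driver's MICRON_ON_DIE_* values. */
enum on_die_ecc {
	ON_DIE_UNSUPPORTED,
	ON_DIE_SUPPORTED,
};

struct known_part {
	const char *model;	/* model string to match (placeholder) */
	enum on_die_ecc ecc;	/* what we know about this part */
};

/* Hypothetical list of parts whose on-die ECC behaviour is known. */
static const struct known_part known_parts[] = {
	{ "MT29F1G08ABAFAWP-ITE:F", ON_DIE_SUPPORTED },
};

/*
 * Look the model up in the known list first. If it is not listed,
 * trust whatever the current detection procedure came up with.
 */
static enum on_die_ecc on_die_ecc_lookup(const char *model,
					 enum on_die_ecc probed)
{
	size_t i;

	for (i = 0; i < sizeof(known_parts) / sizeof(known_parts[0]); i++) {
		if (!strcmp(model, known_parts[i].model))
			return known_parts[i].ecc;
	}

	return probed;
}
```

The same shape would work with READ_ID bytes instead of the model string.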
>>
>> return MICRON_ON_DIE_SUPPORTED;
>> @@ -274,9 +294,9 @@ static int micron_nand_init(struct nand_chip *chip)
>> return -EINVAL;
>> }
>>
>> - chip->ecc.bytes = 8;
>> + chip->ecc.bytes = chip->ecc_strength_ds * 2;
>> chip->ecc.size = 512;
>> - chip->ecc.strength = 4;
>> + chip->ecc.strength = chip->ecc_strength_ds;
>> chip->ecc.algo = NAND_ECC_BCH;
>> chip->ecc.read_page = micron_nand_read_page_on_die_ecc;
>> chip->ecc.write_page = micron_nand_write_page_on_die_ecc;
>
> [1]http://patchwork.ozlabs.org/patch/932006/
>