Re: fsl_ifc_nand: are blank pages protected by ECC?

From: Pavel Machek
Date: Fri Apr 21 2017 - 06:08:24 EST


Hi!

(Added driver author to the cc list, maybe he can help).

> > Hi!
> >
> > We have some problems with fsl_ifc_nand ... in the old kernels, but
> > this one does not seem to be fixed in v4.11, either.
> >
> > UBIFS complains:
> >
> > UBIFS error (pid 931): ubifs_scan: corrupt empty space at LEB 282:252630
> > UBIFS error (pid 931): ubifs_scanned_corruption: corruption at LEB 282:252630
> > UBIFS error (pid 931): ubifs_scanned_corruption: first 1322 bytes from LEB 282:252630
> > UBIFS error (pid 931): ubifs_scan: LEB 282 scanning failed
> >
> > Possible explanation is here:
> >
> > https://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/716/t/289605
> >
> > # I see on the forum that this issue has been raised before - my
> > # understanding is that the omap2 nand driver does not perform ECC
> > # detection/correction on empty pages so when UBIFS checks the empty
> > # space data and doesn't read all 0xFF then it fails and mounts
> > # read-only. I didn't find any good solution - only a workaround to
> > # remove the UBIFS check..
> >
> > So I checked fsl_ifc_nand.c in v4.11-rc, and yes, it seems to have the
> > same problem:
> >
> > 	if (errors == 15) {
> > 		/*
> > 		 * Uncorrectable error.
> > 		 * OK only if the whole page is blank.
> > 		 *
> > 		 * We disable ECCER reporting due to...
> > 		 * erratum IFC-A002770 -- so report it now if we
> > 		 * see an uncorrectable error in ECCSTAT.
> > 		 */
> > 		if (!is_blank(mtd, bufnum))
> > 			ctrl->nand_stat |=
> > 				IFC_NAND_EVTER_STAT_ECCER;
> > 		break;
> > 	}
> >
> > is_blank() checks for all 0xff's, so a single bitflip (e.g. 0xfe) in
> > the data will result in is_blank() == 0 and an uncorrectable error
> > being signaled.
> >
> > Should the driver be modified somehow?
>
> Yep, nand_check_erased_ecc_chunk() [1] is here to help you check this
> case, unfortunately, it's not directly applicable here, because this
> function takes regular pointers and not __iomem ones. You'll either
> have to copy the data in an intermediate buffer before calling
> nand_check_erased_ecc_chunk(), or cast the SRAM region to a void
> pointer (which is usually not a good idea). The last option would be to
> open code nand_check_erased_ecc_chunk(), but I'd really like to avoid
> that (for maintainability concerns).

Ok, took a look. __iomem is one part of the problem. Another part is
that nand_check_erased_ecc_chunk() actually needs to write 0xff's back
to undo the corruption, which would probably be a bad idea to do in
iomem. A third is that is_blank() actually checks an arbitrary number
of regions, based on ecc.layout.

So nand_check_erased_buf() could be used to simplify the code, as in
the patch below (if it were exported; it is not), but it does not fix
the problem, as we still need to undo the corruption.
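
A minimal user-space sketch of the three points above, not driver code.
It assumes the caller has already copied the page out of the controller
SRAM into ordinary RAM (in the driver that would presumably be a
memcpy_fromio() into a bounce buffer), so the write-back of 0xff's
touches only the copy. struct region, count_zero_bits() and
check_erased_page() are hypothetical names made up for this
illustration; the -1 return stands in for the kernel's error code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one ECC-protected OOB region (offset/length). */
struct region {
	size_t offset;
	size_t length;
};

/* Count the bits that read back as 0; erased NAND reads as all 1s. */
static int count_zero_bits(const uint8_t *buf, size_t len)
{
	int zeros = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		uint8_t v = (uint8_t)~buf[i];

		for (; v; v &= (uint8_t)(v - 1))	/* clear lowest set bit */
			zeros++;
	}
	return zeros;
}

/*
 * Tolerant blank check on a RAM copy of the page.
 * Returns the number of bitflips found (and rewrites the checked areas
 * to 0xff) if that number is <= threshold, or -1 if the page is not blank.
 */
static int check_erased_page(uint8_t *data, size_t datalen,
			     uint8_t *oob, const struct region *ecc,
			     size_t nregions, int threshold)
{
	int flips;
	size_t i;

	assert(data && oob && ecc);

	/* Main area plus every ECC region in the OOB count towards the total. */
	flips = count_zero_bits(data, datalen);
	for (i = 0; i < nregions; i++)
		flips += count_zero_bits(oob + ecc[i].offset, ecc[i].length);

	if (flips > threshold)
		return -1;

	/* Undo the corruption, but only in the RAM copy. */
	memset(data, 0xff, datalen);
	for (i = 0; i < nregions; i++)
		memset(oob + ecc[i].offset, 0xff, ecc[i].length);

	return flips;
}
```

With threshold == 0 this behaves like the current strict is_blank();
a threshold tied to the ECC strength would let a page with a few
flipped bits still count as erased.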

Hints welcome, especially if you know the right place to put this
check.

(BTW, switching to ecc.mode = NAND_ECC_SOFT will cause compatibility
problems but should make the problem go away, right?)

Thanks,
Pavel

diff --git a/drivers/mtd/nand/fsl_ifc_nand.c b/drivers/mtd/nand/fsl_ifc_nand.c
index d1570f5..df02d4c 100644
--- a/drivers/mtd/nand/fsl_ifc_nand.c
+++ b/drivers/mtd/nand/fsl_ifc_nand.c
@@ -181,17 +181,15 @@ static int is_blank(struct mtd_info *mtd, unsigned int bufnum)
 	struct mtd_oob_region oobregion = { };
 	int i, section = 0;

-	for (i = 0; i < mtd->writesize / 4; i++) {
-		if (__raw_readl(&mainarea[i]) != 0xffffffff)
-			return 0;
-	}
+	i = nand_check_erased_buf(mainarea, mtd->writesize, 0);
+	if (i)
+		return 0;

 	mtd_ooblayout_ecc(mtd, section++, &oobregion);
 	while (oobregion.length) {
-		for (i = 0; i < oobregion.length; i++) {
-			if (__raw_readb(&oob[oobregion.offset + i]) != 0xff)
-				return 0;
-		}
+		i = nand_check_erased_buf(&oob[oobregion.offset], oobregion.length, 0);
+		if (i)
+			return 0;

 		mtd_ooblayout_ecc(mtd, section++, &oobregion);
 	}




--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
