RE: [PATCH v2] mtd: rawnand: gpmi: fix MX28 bus master lockup problem

From: Han Xu
Date: Tue Feb 05 2019 - 11:28:52 EST




> -----Original Message-----
> From: Martin Kepplinger <martin.kepplinger@xxxxxxxxxxxxx>
> Sent: Tuesday, February 5, 2019 9:53 AM
> To: Han Xu <han.xu@xxxxxxx>; bbrezillon@xxxxxxxxxx;
> miquel.raynal@xxxxxxxxxxx; richard@xxxxxx; dwmw2@xxxxxxxxxxxxx;
> computersforpeace@xxxxxxxxx; marek.vasut@xxxxxxxxx; linux-
> mtd@xxxxxxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx; Manfred Schlaegl
> <manfred.schlaegl@xxxxxxxxxxxxx>; Fabio Estevam <festevam@xxxxxxxxx>
> Subject: [PATCH v2] mtd: rawnand: gpmi: fix MX28 bus master lockup
> problem
>
> Disable BCH soft reset according to MX23 erratum #2847 ("BCH soft
> reset may cause bus master lock up") for MX28 too. It has the same
> problem.
>
> Observed problem: once per 100,000+ MX28 reboots NAND read failed on
> DMA timeout errors:
> [ 1.770823] UBI: attaching mtd3 to ubi0
> [ 2.768088] gpmi_nand: DMA timeout, last DMA :1
> [ 3.958087] gpmi_nand: BCH timeout, last DMA :1
> [ 4.156033] gpmi_nand: Error in ECC-based read: -110
> [ 4.161136] UBI warning: ubi_io_read: error -110 while reading 64 bytes from PEB 0:0, read only 0 bytes, retry
> [ 4.171283] step 1 error
> [ 4.173846] gpmi_nand: Chip: 0, Error -1
>
> Without BCH soft reset we successfully executed 1,000,000 MX28 reboots.
>
> I have a quote from NXP regarding this problem, from July 18th 2016:
>
> "As the i.MX23 and i.MX28 are of the same generation, they share many
> characteristics. Unfortunately, also the erratas may be shared.
> In case of the documented erratas and the workarounds, you can also
> apply the workaround solution of one device on the other one. This have
> been reported, but I'm afraid that there are not an estimated date for
> updating the Errata documents.
> Please accept our apologies for any inconveniences this may cause."
>
> Fixes: 6f2a6a52560a ("mtd: nand: gpmi: reset BCH earlier, too, to avoid
> NAND startup problems")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Manfred Schlaegl <manfred.schlaegl@xxxxxxxxxxxxx>
> Signed-off-by: Martin Kepplinger <martin.kepplinger@xxxxxxxxxxxxx>
> Reviewed-by: Miquel Raynal <miquel.raynal@xxxxxxxxxxx>
> Reviewed-by: Fabio Estevam <festevam@xxxxxxxxx>

Acked-by: Han Xu <han.xu@xxxxxxx>

> ---
>
>
> revision history
> ----------------
> v2: add Fixes tag, Cc stable and add recent Reviewed-by tags
>
>
> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> index bd4cfac6b5aa..a4768df5083f 100644
> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> @@ -155,9 +155,10 @@ int gpmi_init(struct gpmi_nand_data *this)
>
> /*
> * Reset BCH here, too. We got failures otherwise :(
> - * See later BCH reset for explanation of MX23 handling
> + * See later BCH reset for explanation of MX23 and MX28 handling
> */
> - ret = gpmi_reset_block(r->bch_regs, GPMI_IS_MX23(this));
> + ret = gpmi_reset_block(r->bch_regs,
> + GPMI_IS_MX23(this) || GPMI_IS_MX28(this));
> if (ret)
> goto err_out;
>
> @@ -263,12 +264,10 @@ int bch_set_geometry(struct gpmi_nand_data *this)
> /*
> * Due to erratum #2847 of the MX23, the BCH cannot be soft reset on this
> * chip, otherwise it will lock up. So we skip resetting BCH on the MX23.
> - * On the other hand, the MX28 needs the reset, because one case has been
> - * seen where the BCH produced ECC errors constantly after 10000
> - * consecutive reboots. The latter case has not been seen on the MX23
> - * yet, still we don't know if it could happen there as well.
> + * and MX28.
> */
> - ret = gpmi_reset_block(r->bch_regs, GPMI_IS_MX23(this));
> + ret = gpmi_reset_block(r->bch_regs,
> + GPMI_IS_MX23(this) || GPMI_IS_MX28(this));
> if (ret)
> goto err_out;
>
> --
> 2.20.1
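
For anyone skimming the thread, the decision the patch makes at both call
sites can be sketched in isolation as below. This is only an illustration,
not driver code: the enum, the struct and the helper name
bch_skip_soft_reset() are made up for the example, and it assumes (as the
diff suggests) that a true second argument to gpmi_reset_block() means the
soft reset is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's SoC type; not the kernel's definition. */
enum soc_type { SOC_MX23, SOC_MX28, SOC_MX6Q };

/* Hypothetical stand-in for struct gpmi_nand_data, reduced to the SoC type. */
struct nand_ctx {
	enum soc_type soc;
};

/*
 * Mirrors the condition the patch passes to gpmi_reset_block():
 * true for MX23 (erratum #2847) and, after this patch, for MX28 too;
 * false for every other SoC, which keeps the BCH soft reset.
 */
static bool bch_skip_soft_reset(const struct nand_ctx *ctx)
{
	assert(ctx != NULL);
	return ctx->soc == SOC_MX23 || ctx->soc == SOC_MX28;
}
```

In the patch itself the same expression is written inline as
GPMI_IS_MX23(this) || GPMI_IS_MX28(this) in gpmi_init() and
bch_set_geometry().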