RE: [PATCH 1/1] fs:ubifs:recovery:fixup UBIFS cannot recover master node issue

From: Bean Huo 霍斌斌 (beanhuo)
Date: Mon Feb 01 2016 - 02:18:56 EST


Hi, Richard

> From: Richard Weinberger [mailto:richard@xxxxxx]
> Sent: Thursday, January 28, 2016 5:32 PM
> To: Bean Huo 霍斌斌 (beanhuo); Artem Bityutskiy; Adrian Hunter; Brian
> Norris
> Cc: linux-mtd@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Boris
> Brezillon; Peter Pan 潘栋 (peterpandong); Karl Zhang 张双锣 (karlzhang);
> Jason Tian 田晓强 (jasontian)
> Subject: Re: [PATCH 1/1] fs:ubifs:recovery:fixup UBIFS cannot recover master
> node issue
>
> Bean,
>
> Am 28.01.2016 um 03:42 schrieb Bean Huo 霍斌斌 (beanhuo):
> >> This needs a much more detailed explanation.
> >> In which scenarios on SLC NAND can you get such an unmountable UBIFS?
> >
> >
> > It was my mistake to bring SLC NAND into this.
> > Certainly, on SLC NAND, two pages being damaged within one block does
> > not occur.
> > I mean that the master node should be recovered as long as one good
> > master block exists.
> > I think this method is, at least, more reasonable.
> > My question is: why doesn't UBI recover the master node in this scenario?
>
> UBIFS assumes that on SLC NAND already written pages must not corrupt. It
> can deal with the fact that pages can get damaged while writing them (think
> of a power cut).
> But if page X is written and UBIFS moves over to X + 1, X must never corrupt.
>
> If this happens, something very nasty happened and UBIFS cannot operate
> correctly.

Right, on SLC NAND, X will not be damaged, because the power loss happens
while programming X + 1, not X.
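
To check that I read the rule correctly, here is a minimal sketch of it. This
is not the actual UBIFS code: the enum and the function name are made up for
illustration, and a block is modelled simply as an array of page states. Only
the last written page may be corrupted (interrupted write); any corrupted page
before it breaks the assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page's state after scanning a block. */
enum page_state { PAGE_EMPTY, PAGE_GOOD, PAGE_CORRUPT };

/*
 * Returns true if the corruption in the block can be explained by a single
 * interrupted write, i.e. no page before the last written one is corrupted.
 */
bool corruption_is_explainable(const enum page_state *pages, size_t n)
{
    size_t last_written = n; /* n means "nothing written yet" */

    for (size_t i = 0; i < n; i++)
        if (pages[i] != PAGE_EMPTY)
            last_written = i;

    if (last_written == n)
        return true; /* empty block, nothing to violate */

    /* Page X must be intact once writing has moved on to X + 1. */
    for (size_t i = 0; i < last_written; i++)
        if (pages[i] == PAGE_CORRUPT)
            return false;

    return true;
}
```

With this model, { GOOD, GOOD, CORRUPT, EMPTY } is accepted (power cut while
writing the third page), while { GOOD, CORRUPT, GOOD, EMPTY } is rejected.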

> With your patch it may mount somehow but *will* die or lose data soon or
> later as the same assumptions apply to all other UBIFS operations. It is not
> just about master nodes.
> Master nodes are the messenger.
>
> UBIFS' strict checks turned out to be very valuable in the past to identify
> driver/MTD issues.
> This is why I like them so much.

> >> Maybe UBIFS is too strict and NAND behaves differently than UBIFS
> >> expects but we need to understand it in depth.
> > For this, I think maybe MLC NAND had not been released yet when UBI was
> > initially designed.
> > I would like to send my version 2 patch based on your suggestion.
>
> If you can explain in detail why UBIFS' assumptions are wrong and how such
> corruptions can happen on SLC we can talk.
> But I think then we'd have to redo a lot of UBI and UBIFS code.

I will rework my patch and double-check these strict checks.
But I still insist that the master node should always be recovered from the other
good master node, even if two corrupted pages exist in one block. This is more
reasonable and reliable.
Of course, so far we have not met this scenario on SLC NAND.
The current UBIFS master node recovery mechanism can fully handle power loss,
on both MLC and SLC, so why not make UBIFS more reliable? Are the two master
node blocks just for SLC NAND?
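
To make the idea concrete, below is a rough sketch of the selection policy I am
arguing for. It is not the current UBIFS recovery code and not my patch: the
struct, its fields and the function names are hypothetical, and each master
block is modelled as an array of scanned slots. The point is only that any
valid copy in either block is usable, no matter how many corrupted pages the
other block contains.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of scanning one page of a master block. */
struct mst_slot {
    bool valid;     /* assumed: node was read and passed its CRC check */
    uint64_t sqnum; /* assumed: higher value means newer node */
};

/* Newest valid node in one block, or NULL if every slot is bad or empty. */
const struct mst_slot *newest_valid(const struct mst_slot *slots, size_t n)
{
    const struct mst_slot *best = NULL;

    for (size_t i = 0; i < n; i++)
        if (slots[i].valid && (!best || slots[i].sqnum > best->sqnum))
            best = &slots[i];

    return best;
}

/* Pick the newest valid master node across both copies. */
const struct mst_slot *pick_master(const struct mst_slot *leb_a, size_t na,
                                   const struct mst_slot *leb_b, size_t nb)
{
    const struct mst_slot *a = newest_valid(leb_a, na);
    const struct mst_slot *b = newest_valid(leb_b, nb);

    if (!a)
        return b; /* may be NULL: no good master anywhere */
    if (!b)
        return a;
    return (b->sqnum > a->sqnum) ? b : a;
}
```

Mounting would fail only when pick_master() returns NULL, that is, when
neither block holds a good master node. I understand this relaxes the strict
check you described, so it would need the other checks reviewed as well.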
> Thanks,
> //richard