[PATCH 4.4 81/86] UBI: Fix static volume checks when Fastmap is used

From: Greg Kroah-Hartman
Date: Mon May 30 2016 - 16:53:46 EST


4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Richard Weinberger <richard@xxxxxx>

commit 1900149c835ab5b48bea31a823ea5e5a401fb560 upstream.

Ezequiel reported that he's facing UBI going into read-only
mode after power cut. It turned out that this behavior happens
only when updating a static volume is interrupted and Fastmap is
used.

A possible trace can look like:
ubi0 warning: ubi_io_read_vid_hdr [ubi]: no VID header found at PEB 2323, only 0xFF bytes
ubi0 warning: ubi_eba_read_leb [ubi]: switch to read-only mode
CPU: 0 PID: 833 Comm: ubiupdatevol Not tainted 4.6.0-rc2-ARCH #4
Hardware name: SAMSUNG ELECTRONICS CO., LTD. 300E4C/300E5C/300E7C/NP300E5C-AD8AR, BIOS P04RAP 10/15/2012
0000000000000286 00000000eba949bd ffff8800c45a7b38 ffffffff8140d841
ffff8801964be000 ffff88018eaa4800 ffff8800c45a7bb8 ffffffffa003abf6
ffffffff850e2ac0 8000000000000163 ffff8801850e2ac0 ffff8801850e2ac0
Call Trace:
[<ffffffff8140d841>] dump_stack+0x63/0x82
[<ffffffffa003abf6>] ubi_eba_read_leb+0x486/0x4a0 [ubi]
[<ffffffffa00453b3>] ubi_check_volume+0x83/0xf0 [ubi]
[<ffffffffa0039d97>] ubi_open_volume+0x177/0x350 [ubi]
[<ffffffffa00375d8>] vol_cdev_open+0x58/0xb0 [ubi]
[<ffffffff8124b08e>] chrdev_open+0xae/0x1d0
[<ffffffff81243bcf>] do_dentry_open+0x1ff/0x300
[<ffffffff8124afe0>] ? cdev_put+0x30/0x30
[<ffffffff81244d36>] vfs_open+0x56/0x60
[<ffffffff812545f4>] path_openat+0x4f4/0x1190
[<ffffffff81256621>] do_filp_open+0x91/0x100
[<ffffffff81263547>] ? __alloc_fd+0xc7/0x190
[<ffffffff812450df>] do_sys_open+0x13f/0x210
[<ffffffff812451ce>] SyS_open+0x1e/0x20
[<ffffffff81a99e32>] entry_SYSCALL_64_fastpath+0x1a/0xa4

UBI checks static volumes for data consistency and reads the
whole volume upon first open. If the volume is found erroneous
users of UBI cannot read from it, but another volume update is
possible to fix it. The check is performed by running
ubi_eba_read_leb() on every allocated LEB of the volume.
For static volumes ubi_eba_read_leb() computes the checksum of all
data stored in a LEB. To verify the computed checksum it has to read
the LEB's volume header which stores the original checksum.
If the volume header is not found UBI treats this as fatal internal
error and switches to RO mode. If the UBI device was attached via a
full scan the assumption is correct, the volume header has to be
present as it had to be there while scanning to get known as mapped.
If the attach operation happened via Fastmap the assumption is no
longer correct. When attaching via Fastmap UBI learns the mapping
table from Fastmap's snapshot of the system state and not via a full
scan. It can happen that a LEB got unmapped after a Fastmap was
written to the flash. Then UBI can learn the LEB still as mapped and
accessing it returns only 0xFF bytes. As UBI is not a FTL it is
allowed to have mappings to empty PEBs, it assumes that the layer
above takes care of LEB accounting and referencing.
UBIFS does so using the LEB property tree (LPT).
For static volumes UBI blindly assumes that all LEBs are present and
therefore special actions have to be taken.

The described situation can happen when updating a static volume is
interrupted, either by a user or a power cut.
The volume update code first unmaps all LEBs of a volume and then
writes LEB by LEB. If the sequence of operations is interrupted UBI
detects this either by the absence of LEBs, no volume header present
at scan time, or corrupted payload, detected via checksum.
In the Fastmap case the former method won't trigger as no scan
happened and UBI automatically thinks all LEBs are present.
Only by reading data from a LEB it detects that the volume header is
missing and incorrectly treats this as fatal error.
To deal with the situation ubi_eba_read_leb() from now on checks
whether we attached via Fastmap and handles the absence of a
volume header like a data corruption error.
This way interrupted static volume updates will correctly get detected
also when Fastmap is used.

Reported-by: Ezequiel Garcia <ezequiel@xxxxxxxxxxxxxxxxxxxx>
Tested-by: Ezequiel Garcia <ezequiel@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Richard Weinberger <richard@xxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
drivers/mtd/ubi/eba.c | 21 +++++++++++++++++++--
drivers/mtd/ubi/fastmap.c | 1 +
drivers/mtd/ubi/ubi.h | 2 ++
3 files changed, 22 insertions(+), 2 deletions(-)

--- a/drivers/mtd/ubi/eba.c
+++ b/drivers/mtd/ubi/eba.c
@@ -426,8 +426,25 @@ retry:
pnum, vol_id, lnum);
err = -EBADMSG;
} else {
- err = -EINVAL;
- ubi_ro_mode(ubi);
+ /*
+ * Ending up here in the non-Fastmap case
+ * is a clear bug as the VID header had to
+ * be present at scan time to have it referenced.
+ * With fastmap the story is more complicated.
+ * Fastmap has the mapping info without the need
+ * of a full scan. So the LEB could have been
+ * unmapped, Fastmap cannot know this and keeps
+ * the LEB referenced.
+ * This is valid and works as the layer above UBI
+ * has to do bookkeeping about used/referenced
+ * LEBs in any case.
+ */
+ if (ubi->fast_attach) {
+ err = -EBADMSG;
+ } else {
+ err = -EINVAL;
+ ubi_ro_mode(ubi);
+ }
}
}
goto out_free;
--- a/drivers/mtd/ubi/fastmap.c
+++ b/drivers/mtd/ubi/fastmap.c
@@ -1058,6 +1058,7 @@ int ubi_scan_fastmap(struct ubi_device *
ubi_msg(ubi, "fastmap WL pool size: %d",
ubi->fm_wl_pool.max_size);
ubi->fm_disabled = 0;
+ ubi->fast_attach = 1;

ubi_free_vid_hdr(ubi, vh);
kfree(ech);
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -462,6 +462,7 @@ struct ubi_debug_info {
* @fm_eba_sem: allows ubi_update_fastmap() to block EBA table changes
* @fm_work: fastmap work queue
* @fm_work_scheduled: non-zero if fastmap work was scheduled
+ * @fast_attach: non-zero if UBI was attached by fastmap
*
* @used: RB-tree of used physical eraseblocks
* @erroneous: RB-tree of erroneous used physical eraseblocks
@@ -570,6 +571,7 @@ struct ubi_device {
size_t fm_size;
struct work_struct fm_work;
int fm_work_scheduled;
+ int fast_attach;

/* Wear-leveling sub-system's stuff */
struct rb_root used;