[dm-devel][PATCH -next] dm thin: Use last transaction's pmd->root when commit failed

From: Zhihao Cheng
Date: Thu Dec 08 2022 - 09:07:23 EST


Recently we found a softlock up problem in dm thin pool looking up btree
caused by corrupted metadata:
Kernel panic - not syncing: softlockup: hung tasks
CPU: 7 PID: 2669225 Comm: kworker/u16:3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
Workqueue: dm-thin do_worker [dm_thin_pool]
Call Trace:
<IRQ>
dump_stack+0x9c/0xd3
panic+0x35d/0x6b9
watchdog_timer_fn.cold+0x16/0x25
__run_hrtimer+0xa2/0x2d0
</IRQ>
RIP: 0010:__relink_lru+0x102/0x220 [dm_bufio]
__bufio_new+0x11f/0x4f0 [dm_bufio]
new_read+0xa3/0x1e0 [dm_bufio]
dm_bm_read_lock+0x33/0xd0 [dm_persistent_data]
ro_step+0x63/0x100 [dm_persistent_data]
btree_lookup_raw.constprop.0+0x44/0x220 [dm_persistent_data]
dm_btree_lookup+0x16f/0x210 [dm_persistent_data]
dm_thin_find_block+0x12c/0x210 [dm_thin_pool]
__process_bio_read_only+0xc5/0x400 [dm_thin_pool]
process_thin_deferred_bios+0x1a4/0x4a0 [dm_thin_pool]
process_one_work+0x3c5/0x730

Following process may generate a broken btree mixed with fresh and stale
nodes, which could let dm thin trapped into an infinite loop while looking
up data block:
Transaction 1: pmd->root = A, A->B->C // One path in btree
pmd->root = X, X->Y->Z // Copy-up
Transaction 2: X,Z is updated on disk, Y is written failed.
// Commit failed, dm thin becomes read-only.
process_bio_read_only
dm_thin_find_block
__find_block
dm_btree_lookup(pmd->root)
The pmd->root points to a broken btree, Y may contain stale node pointing
to any block, for example X, which lets dm thin trapped into a dead loop
while looking up Z.

Fix it by setting pmd->root in __open_metadata(), so that dm thin will use
last transaction's pmd->root if commit failed.

Fetch a reproducer in [Link].

Linke: https://bugzilla.kernel.org/show_bug.cgi?id=216790
Fixes: 991d9fa02da0 ("dm: add thin provisioning target")
Signed-off-by: Zhihao Cheng <chengzhihao1@xxxxxxxxxx>
---
drivers/md/dm-thin-metadata.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index 1a62226ac34e..26c42ee183ed 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -724,6 +724,15 @@ static int __open_metadata(struct dm_pool_metadata *pmd)
goto bad_cleanup_data_sm;
}

+ /*
+ * For pool metadata opening process, root setting is redundant
+ * because it will be set again in __begin_transaction(). But dm
+ * pool aborting process really needs to get last transaction's
+ * root in case accessing broken btree.
+ */
+ pmd->root = le64_to_cpu(disk_super->data_mapping_root);
+ pmd->details_root = le64_to_cpu(disk_super->device_details_root);
+
__setup_btree_details(pmd);
dm_bm_unlock(sblock);

--
2.31.1