Re: [PATCH v4 3/7] btrfs: split RAID stripes on deletion

From: Qu Wenruo
Date: Mon Jul 08 2024 - 19:02:53 EST

On 2024/7/8 20:22, Johannes Thumshirn wrote:
On 08.07.24 07:26, Johannes Thumshirn wrote:
On 08.07.24 07:20, Qu Wenruo wrote:

Can the ASSERT() be reproduced without a zoned device? (I'm really not a
fan of the existing tcmu-emulated solution, and libvirt still doesn't
support ZNS devices.)

If it can be reproduced with just the RST feature, I may be able to help
dig into the ASSERT().

Let me check. It's very sporadic as well, unfortunately.



OK, I've managed to trigger the failure with btrfs/070 on a
SCRATCH_DEV_POOL with 5 non-zoned devices.

I'm hitting errors like this:

[ 227.898320] ------------[ cut here ]------------
[ 227.898817] BTRFS: Transaction aborted (error -17)
[ 227.899250] WARNING: CPU: 7 PID: 65 at
fs/btrfs/raid-stripe-tree.c:116 btrfs_insert_raid_extent+0x337/0x3d0 [btrfs]
[ 227.900122] Modules linked in: btrfs blake2b_generic xor
zstd_compress vfat fat intel_rapl_msr intel_rapl_common crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel iTCO_wdt iTCO_vendor_support
aesni_intel crypto_simd cryptd psmouse i2c_i801 pcspkr i2c_smbus lpc_ich
intel_agp intel_gtt joydev agpgart mousedev raid6_pq libcrc32c loop drm
fuse qemu_fw_cfg ext4 crc32c_generic crc16 mbcache jbd2 dm_mod
virtio_rng virtio_net virtio_blk virtio_balloon net_failover
virtio_console failover virtio_scsi rng_core dimlib usbhid virtio_pci
virtio_pci_legacy_dev crc32c_intel virtio_pci_modern_dev serio_raw
[ 227.904452] CPU: 7 PID: 65 Comm: kworker/u40:0 Not tainted
6.10.0-rc6-custom+ #167
[ 227.905220] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
unknown 2/2/2022
[ 227.905827] Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
[ 227.906558] RIP: 0010:btrfs_insert_raid_extent+0x337/0x3d0 [btrfs]
[ 227.907246] Code: 89 6b 08 e8 4b 18 f7 ff 49 8b 84 24 50 02 00 00 4c
39 f0 75 be 31 db e9 7d fe ff ff 89 de 48 c7 c7 f0 8d 9d a0 e8 29 a1 79
e0 <0f> 0b e9 69 ff ff ff e8 bd 95 3e e1 49 8b 46 60 48 05 48 1a 00 00
[ 227.908277] BTRFS: error (device dm-3 state A) in
btrfs_insert_one_raid_extent:116: errno=-17 Object already exists
[ 227.909356] RSP: 0018:ffffc9000026fca0 EFLAGS: 00010282
[ 227.909361] RAX: 0000000000000000 RBX: 00000000ffffffef RCX:
0000000000000027
[ 227.911934] RDX: ffff88817bda1948 RSI: 0000000000000001 RDI:
ffff88817bda1940
[ 227.912722] RBP: ffff8881029dcbe0 R08: 0000000000000000 R09:
0000000000000003
[ 227.913095] BTRFS info (device dm-3 state EA): forced readonly
[ 227.913569] R10: ffffc9000026fb38 R11: ffffffff826d0508 R12:
0000000000000010
[ 227.915182] R13: ffff8881029dcbe0 R14: ffff88812a5ff790 R15:
ffff8881488f2780
[ 227.916130] FS: 0000000000000000(0000) GS:ffff88817bd80000(0000)
knlGS:0000000000000000
[ 227.916912] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 227.917500] CR2: 0000561364dec000 CR3: 00000001583ca000 CR4:
0000000000750ef0
[ 227.918210] PKRU: 55555554
[ 227.918484] Call Trace:
[ 227.918727] <TASK>
[ 227.918940] ? __warn+0x8c/0x180
[ 227.919299] ? btrfs_insert_raid_extent+0x337/0x3d0 [btrfs]
[ 227.919891] ? report_bug+0x164/0x190
[ 227.920272] ? prb_read_valid+0x1b/0x30
[ 227.920666] ? handle_bug+0x3c/0x80
[ 227.921013] ? exc_invalid_op+0x17/0x70
[ 227.921397] ? asm_exc_invalid_op+0x1a/0x20
[ 227.921835] ? btrfs_insert_raid_extent+0x337/0x3d0 [btrfs]
[ 227.922440] btrfs_finish_one_ordered+0x3c3/0xaa0 [btrfs]
[ 227.923055] ? srso_alias_return_thunk+0x5/0xfbef5
[ 227.923549] ? srso_alias_return_thunk+0x5/0xfbef5
[ 227.924062] btrfs_work_helper+0x107/0x4c0 [btrfs]
[ 227.924612] ? lock_is_held_type+0x9a/0x110
[ 227.925040] process_one_work+0x212/0x720
[ 227.925454] ? srso_alias_return_thunk+0x5/0xfbef5
[ 227.926010] worker_thread+0x1dc/0x3b0
[ 227.926411] ? __pfx_worker_thread+0x10/0x10
[ 227.926918] kthread+0xe0/0x110
[ 227.927377] ? __pfx_kthread+0x10/0x10
[ 227.927776] ret_from_fork+0x31/0x50
[ 227.928151] ? __pfx_kthread+0x10/0x10
[ 227.928564] ret_from_fork_asm+0x1a/0x30
[ 227.929035] </TASK>
[ 227.929305] irq event stamp: 11077
[ 227.929710] hardirqs last enabled at (11085): [<ffffffff8115daf5>]
console_unlock+0x135/0x160
[ 227.930725] hardirqs last disabled at (11094): [<ffffffff8115dada>]
console_unlock+0x11a/0x160
[ 227.931730] softirqs last enabled at (10728): [<ffffffff810b4684>]
__irq_exit_rcu+0x84/0xa0
[ 227.932568] softirqs last disabled at (10723): [<ffffffff810b4684>]
__irq_exit_rcu+0x84/0xa0
[ 227.933494] ---[ end trace 0000000000000000 ]---
[ 227.933992] BTRFS: error (device dm-3 state EA) in
btrfs_insert_one_raid_extent:116: errno=-17 Object already exists
[ 227.935193] BTRFS warning (device dm-3 state EA): Skipping commit of
aborted transaction.
[ 227.936383] BTRFS: error (device dm-3 state EA) in
cleanup_transaction:2018: errno=-17 Object already exists

But that's not the ASSERT() yet.

I guess I need the first patch applied to get past this failure first?

Thanks,
Qu