[PATCH 5.16 0886/1039] btrfs: zoned: unset dedicated block group on allocation failure

From: Greg Kroah-Hartman
Date: Mon Jan 24 2022 - 19:23:33 EST


From: Naohiro Aota <naohiro.aota@xxxxxxx>

commit 1ada69f61c88abb75a1038ee457633325658a183 upstream.

Allocating an extent from a block group can fail for various reasons.
When an allocation from a dedicated block group (for tree-log or
relocation data) fails, we need to unregister it as a dedicated one so
that we can allocate a new block group for the dedicated one.

However, we are returning early when the block group in case it is
read-only, fully used, or not be able to activate the zone. As a result,
we keep the non-usable block group as a dedicated one, leading to
further allocation failure. With many block groups, the allocator will
iterate hopeless loop to find a free extent, results in a hung task.

Fix the issue by delaying the return and doing the proper cleanups.

CC: stable@xxxxxxxxxxxxxxx # 5.16
Signed-off-by: Naohiro Aota <naohiro.aota@xxxxxxx>
Signed-off-by: David Sterba <dsterba@xxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
fs/btrfs/extent-tree.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)

--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3790,23 +3790,35 @@ static int do_allocation_zoned(struct bt
spin_unlock(&fs_info->relocation_bg_lock);
if (skip)
return 1;
+
/* Check RO and no space case before trying to activate it */
spin_lock(&block_group->lock);
if (block_group->ro ||
block_group->alloc_offset == block_group->zone_capacity) {
- spin_unlock(&block_group->lock);
- return 1;
+ ret = 1;
+ /*
+ * May need to clear fs_info->{treelog,data_reloc}_bg.
+ * Return the error after taking the locks.
+ */
}
spin_unlock(&block_group->lock);

- if (!btrfs_zone_activate(block_group))
- return 1;
+ if (!ret && !btrfs_zone_activate(block_group)) {
+ ret = 1;
+ /*
+ * May need to clear fs_info->{treelog,data_reloc}_bg.
+ * Return the error after taking the locks.
+ */
+ }

spin_lock(&space_info->lock);
spin_lock(&block_group->lock);
spin_lock(&fs_info->treelog_bg_lock);
spin_lock(&fs_info->relocation_bg_lock);

+ if (ret)
+ goto out;
+
ASSERT(!ffe_ctl->for_treelog ||
block_group->start == fs_info->treelog_bg ||
fs_info->treelog_bg == 0);