Re: [PATCH v2 3/3] btrfs: fix check_chunk_block_group_mappings() to actually iterate all chunks
From: David Sterba
Date: Mon Mar 23 2026 - 14:06:09 EST
On Sat, Mar 14, 2026 at 08:37:41PM +0800, ZhengYuan Huang wrote:
> [BUG]
> A corrupted image with a chunk present in the chunk tree but whose
> corresponding block group item is missing from the extent tree can be
> mounted successfully, even though check_chunk_block_group_mappings()
> is supposed to catch exactly this corruption at mount time. Once
> mounted, running btrfs balance with a usage filter (-dusage=N or
> -dusage=min..max) triggers a null-ptr-deref:
>
> KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
> RIP: 0010:chunk_usage_filter fs/btrfs/volumes.c:3874 [inline]
> RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4018 [inline]
> RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4172 [inline]
> RIP: 0010:btrfs_balance+0x2024/0x42b0 fs/btrfs/volumes.c:4604
>
> The crash occurs because __btrfs_balance() iterates the on-disk chunk
> tree, finds the orphaned chunk, calls chunk_usage_filter() (or
> chunk_usage_range_filter()), which queries the in-memory block group
> cache via btrfs_lookup_block_group(). Since no block group was ever
> inserted for this chunk, the lookup returns NULL, and the subsequent
> dereference of cache->used crashes.
>
> [CAUSE]
> check_chunk_block_group_mappings() uses btrfs_find_chunk_map() to
> iterate the in-memory chunk map (fs_info->mapping_tree):
>
> map = btrfs_find_chunk_map(fs_info, start, 1);
>
> With @start = 0 and @length = 1, btrfs_find_chunk_map() looks for a
> chunk map that *contains* the logical address 0. If no chunk contains
> logical address 0, btrfs_find_chunk_map(fs_info, 0, 1) returns NULL
> immediately and the loop breaks after the very first iteration,
> having checked zero chunks. The entire verification function is therefore
> a no-op, and the corrupted image passes the mount-time check undetected.
>
> [FIX]
> Replace the btrfs_find_chunk_map() based loop with a direct in-order
> walk of fs_info->mapping_tree using rb_first_cached() + rb_next(),
> protected by mapping_tree_lock. This guarantees that every chunk map
> in the tree is visited regardless of the logical addresses involved.
> Since the mapping_tree itself is accessed under read_lock, no refcount
> manipulation of each map entry is needed inside the loop, so the
> btrfs_free_chunk_map() calls on the map are also removed.
>
> Signed-off-by: ZhengYuan Huang <gality369@xxxxxxxxx>
> ---
> fs/btrfs/block-group.c | 21 ++++++---------------
> 1 file changed, 6 insertions(+), 15 deletions(-)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 5322ef2ae015..25bd0d058be6 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2319,29 +2319,22 @@ static struct btrfs_block_group *btrfs_create_block_group_cache(
> */
> static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info)
> {
> - u64 start = 0;
> + struct rb_node *node;
> int ret = 0;
>
> - while (1) {
> + read_lock(&fs_info->mapping_tree_lock);
This is called during mount, indirectly from open_ctree(), and that path
is (partially) single threaded, so the lock may not be needed. It would
be needed if something like a caching thread could be accessing the same
structures at that point; I haven't looked closely.
> + for (node = rb_first_cached(&fs_info->mapping_tree); node;
> + node = rb_next(node)) {
> struct btrfs_chunk_map *map;
> struct btrfs_block_group *bg;
>
> - /*
> - * btrfs_find_chunk_map() will return the first chunk map
> - * intersecting the range, so setting @length to 1 is enough to
> - * get the first chunk.
> - */
> - map = btrfs_find_chunk_map(fs_info, start, 1);
> - if (!map)
> - break;
> -
> + map = rb_entry(node, struct btrfs_chunk_map, rb_node);
> bg = btrfs_lookup_block_group(fs_info, map->start);
What concerns me is this lookup. Previously the references avoided
taking the big lock. The time the lock is held may add up significantly
over all block groups, but as said above it might not be necessary due to
the mount context.
> if (unlikely(!bg)) {
> btrfs_err(fs_info,
> "chunk start=%llu len=%llu doesn't have corresponding block group",
> map->start, map->chunk_len);
> ret = -EUCLEAN;
> - btrfs_free_chunk_map(map);
> break;
> }
> if (unlikely(bg->start != map->start || bg->length != map->chunk_len ||
> @@ -2354,14 +2347,12 @@ static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info)
> bg->start, bg->length,
> bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
> ret = -EUCLEAN;
> - btrfs_free_chunk_map(map);
> btrfs_put_block_group(bg);
> break;
> }
> - start = map->start + map->chunk_len;
> - btrfs_free_chunk_map(map);
> btrfs_put_block_group(bg);
> }
> + read_unlock(&fs_info->mapping_tree_lock);
> return ret;
> }
>
> --
> 2.43.0
>
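
For reference, a minimal user-space sketch of the failure mode described
under [CAUSE], assuming a sorted array stands in for the mapping tree.
All toy_* names and the two sample layouts are hypothetical and are not
kernel APIs; the lookup only models "return an entry intersecting
[logical, logical + length)" as the changelog describes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a chunk map entry covering [start, start + len). */
struct toy_map {
	uint64_t start;
	uint64_t len;
};

/* Sample layout with no chunk covering logical address 0 (assumed values). */
static const struct toy_map toy_gap_layout[] = {
	{ 1048576, 4194304 },
	{ 5242880, 8388608 },
};

/* Sample layout that starts at 0 and has no holes (assumed values). */
static const struct toy_map toy_contig_layout[] = {
	{ 0, 4096 },
	{ 4096, 4096 },
};

/*
 * Model of a range lookup: return the first entry intersecting
 * [logical, logical + length), or NULL if there is none.
 * The array is assumed to be sorted by start and non-overlapping.
 */
static const struct toy_map *toy_find(const struct toy_map *maps, size_t n,
				      uint64_t logical, uint64_t length)
{
	for (size_t i = 0; i < n; i++) {
		if (logical < maps[i].start + maps[i].len &&
		    logical + length > maps[i].start)
			return &maps[i];
	}
	return NULL;
}

/* Lookup-driven loop: probe one byte at @start, then jump past the hit. */
static size_t toy_visit_by_lookup(const struct toy_map *maps, size_t n)
{
	uint64_t start = 0;
	size_t visited = 0;

	for (;;) {
		const struct toy_map *map = toy_find(maps, n, start, 1);

		if (!map)
			break;
		visited++;
		start = map->start + map->len;
	}
	return visited;
}

/* In-order walk: every entry is visited regardless of its address. */
static size_t toy_visit_in_order(const struct toy_map *maps, size_t n)
{
	size_t visited = 0;

	assert(maps || n == 0);
	for (size_t i = 0; i < n; i++)
		visited++;
	return visited;
}
```

In this model, toy_visit_by_lookup(toy_gap_layout, 2) visits no entries
because nothing intersects address 0, while toy_visit_in_order() visits
both. On toy_contig_layout the two loops agree, which is why the problem
only shows up when the first probe misses.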