Re: [PATCH v3] btrfs: validate data reloc tree file extent item members in tree-checker

From: David Sterba

Date: Tue Apr 28 2026 - 11:29:54 EST


On Tue, Apr 28, 2026 at 10:14:40AM +0930, Qu Wenruo wrote:
>
>
> On 2026/4/28 07:45, Qu Wenruo wrote:
> >
> >
> > On 2026/4/28 05:54, Teng Liu wrote:
> >> get_new_location() uses BUG_ON() to crash the kernel if the file extent
> >> item it looks up has any of offset, compression, encryption, or
> >> other_encoding set. The data reloc inode is only written by relocation's
> >> own paths -- insert_prealloc_file_extent() and
> >> insert_ordered_extent_file_extent() -- which always leave those four
> >> fields at 0 (the data reloc inode is created with BTRFS_INODE_NOCOMPRESS,
> >> and encryption/other_encoding are reserved-and-zero). Observing a
> >> non-zero value therefore means the leaf decoded from disk does not match
> >> what the kernel wrote, i.e. on-disk corruption. A malformed image can
> >> reach this code via balance and panic the kernel.
> >>
> >> Move the validation into tree-checker's check_extent_data_item(), where
> >> the constraint is enforced when the leaf is read off disk rather than
> >> after relocation has already started. The data reloc tree has a fixed
> >> root id (BTRFS_DATA_RELOC_TREE_OBJECTID) recorded in the extent buffer
> >> header, so check_extent_data_item() has all the information it needs to
> >> apply this check on its own. Report violations via file_extent_err() and
> >> print the four offending values.
> >>
> >> In get_new_location() replace the BUG_ON() with an ASSERT().
> >> The caller in replace_file_extents() already handles non-zero returns
> >> from get_new_location() by breaking out of the loop without aborting
> >> the transaction, so no caller changes are needed.
> >>
> >> Suggested-by: Qu Wenruo <wqu@xxxxxxxx>
> >> Suggested-by: David Sterba <dsterba@xxxxxxxx>
> >> Reported-by: syzbot+3e20d8f3d41bac5dc9a2@xxxxxxxxxxxxxxxxxxxxxxxxx
> >> Closes: https://syzkaller.appspot.com/bug?extid=3e20d8f3d41bac5dc9a2
> >> Signed-off-by: Teng Liu <27rabbitlt@xxxxxxxxx>
> >
> > Reviewed-by: Qu Wenruo <wqu@xxxxxxxx>
> >
> > And merged.
>
> Unfortunately this new check got triggered by the write-time
> tree-checker during btrfs/061 runs, on arm64 with 64K page size.
>
> The offending file extent is the following:
>
> [ 536.885066] item 69 key (258 EXTENT_DATA 4063232) itemoff 12400 itemsize 53
> [ 536.885067] generation 28 type 1
> [ 536.885067] extent data disk bytenr 10512723968 nr 36864
> [ 536.885068] extent data offset 24576 nr 12288 ram 36864
> [ 536.885069] extent compression 0
>
> Note the offset is not zero, and the type is 1 which means it's a
> regular file extent.
>
> So the check is causing false alerts.

I may have an idea. The difference between the BUG_ON() and the
tree-checker is the context where the check is called. In relocation it's
somewhere in the middle and there are actions fixing up the offset. OTOH
when this is done in tree-checker the constraints are different.

get_new_location() - verifies offset, compression, ...

The offset corresponds to 'bytenr' and is returned via *new_bytenr to
replace_file_extents() and then updated in the leaf:

btrfs_set_file_extent_disk_bytenr(leaf, fi, new_bytenr);

This eventually ends up in the pre-write check.
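
To illustrate what the write-time check trips over, here is a minimal
userspace sketch, not the kernel code. struct fe_model,
check_reloc_file_extent() and reported_item() are made-up names for
illustration only. The rule is the one from the patch description (offset,
compression, encryption and other_encoding must be zero in the data reloc
tree), the values are copied from the dump Qu posted, and it assumes the
leaf is owned by the data reloc tree. The two constants are meant to match
the kernel headers.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Intended to match BTRFS_DATA_RELOC_TREE_OBJECTID and BTRFS_FILE_EXTENT_REG. */
#define DATA_RELOC_TREE_OBJECTID ((uint64_t)-9)
#define FILE_EXTENT_REG          1

/*
 * Hypothetical, simplified view of a file extent item. This is NOT the
 * on-disk struct btrfs_file_extent_item, only the fields discussed here.
 */
struct fe_model {
	uint64_t owner;          /* root id from the extent buffer header */
	uint8_t  type;           /* 1 == regular file extent */
	uint64_t disk_bytenr;
	uint64_t disk_num_bytes;
	uint64_t offset;         /* offset into the on-disk extent */
	uint64_t num_bytes;
	uint64_t ram_bytes;
	uint8_t  compression;
	uint8_t  encryption;
	uint16_t other_encoding;
};

/*
 * Sketch of the rule as described in the patch: for leaves owned by the
 * data reloc tree, all four members must be zero. Returns 0 if the item
 * passes, -1 if it would be reported as corrupted.
 */
int check_reloc_file_extent(const struct fe_model *fe)
{
	if (fe->owner != DATA_RELOC_TREE_OBJECTID)
		return 0;

	if (fe->offset || fe->compression || fe->encryption ||
	    fe->other_encoding) {
		fprintf(stderr,
			"invalid data reloc file extent: offset %" PRIu64
			" compression %u encryption %u other_encoding %u\n",
			fe->offset, (unsigned)fe->compression,
			(unsigned)fe->encryption, (unsigned)fe->other_encoding);
		return -1;
	}
	return 0;
}

/* The item from the btrfs/061 report, owner assumed to be the reloc tree. */
struct fe_model reported_item(void)
{
	struct fe_model fe = {
		.owner          = DATA_RELOC_TREE_OBJECTID, /* assumption */
		.type           = FILE_EXTENT_REG,
		.disk_bytenr    = 10512723968ULL,
		.disk_num_bytes = 36864,
		.offset         = 24576,
		.num_bytes      = 12288,
		.ram_bytes      = 36864,
		.compression    = 0,
		.encryption     = 0,
		.other_encoding = 0,
	};
	return fe;
}
```

With the reported values check_reloc_file_extent() returns -1 purely
because of offset 24576, while the same item with offset 0 passes. So a
rule that was fine at the point where get_new_location() does its lookup
is too strict when applied to every data reloc leaf on its way to disk.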