Re: [PATCH] btrfs: Fix BTRFS arm64 tagged KASAN false-positive

From: David Sterba

Date: Fri Mar 20 2026 - 18:01:52 EST


On Thu, Mar 19, 2026 at 07:26:34PM +1030, Qu Wenruo wrote:
>
>
> On 2026/3/19 16:04, Daniel J Blueman wrote:
> > When booting Linux 7.0-rc4 on a Qualcomm Snapdragon X1 with KASAN
> > software tagging with a BTRFS filesystem, we see:
> >
> > BUG: KASAN: invalid-access in xxh64_update (lib/xxhash.c:143 lib/xxhash.c:283)
> > Read of size 8 at addr 7bff000804fe1000 by task kworker/u49:2/138
> > Pointer tag: [7b], memory tag: [b2]
> >
> > CPU: 0 UID: 0 PID: 138 Comm: kworker/u49:2 Not tainted 7.0.0-rc4+ #34 PREEMPTLAZY
> > Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN60WW 09/11/2025
> > Workqueue: btrfs-endio-meta simple_end_io_work
> > Call trace:
> > show_stack (arch/arm64/kernel/stacktrace.c:501) (C)
> > dump_stack_lvl (lib/dump_stack.c:122)
> > print_report (mm/kasan/report.c:379 mm/kasan/report.c:482)
> > kasan_report (mm/kasan/report.c:597)
> > kasan_check_range (mm/kasan/sw_tags.c:86 (discriminator 1))
> > __hwasan_loadN_noabort (mm/kasan/sw_tags.c:158)
> > xxh64_update (lib/xxhash.c:143 lib/xxhash.c:283)
> > btrfs_csum_update (fs/btrfs/fs.c:106)
> > csum_tree_block (fs/btrfs/disk-io.c:103 (discriminator 3))
> > btrfs_validate_extent_buffer (fs/btrfs/disk-io.c:389)
> > end_bbio_meta_read (fs/btrfs/extent_io.c:3853 (discriminator 1))
> > btrfs_bio_end_io (fs/btrfs/bio.c:152)
> > simple_end_io_work (fs/btrfs/bio.c:388)
> > process_one_work (./arch/arm64/include/asm/jump_label.h:36 ./include/trace/events/workqueue.h:110 kernel/workqueue.c:3281)
> > worker_thread (kernel/workqueue.c:3353 (discriminator 2) kernel/workqueue.c:3440 (discriminator 2))
> > kthread (kernel/kthread.c:436)
> > ret_from_fork (arch/arm64/kernel/entry.S:861)
> >
> > The buggy address belongs to the physical page:
> > page: refcount:3 mapcount:0 mapping:f1ff00080055dee8 index:0x2467bd pfn:0x884fe1
> > memcg:51ff000800e68ec0 aops:btree_aops ino:1
> > flags: 0x9340000000004000(private|zone=2|kasantag=0x4d)
> > raw: 9340000000004000 0000000000000000 dead000000000122 f1ff00080055dee8
> > raw: 00000000002467bd 43ff00081d0cc6f0 00000003ffffffff 51ff000800e68ec0
> > page dumped because: kasan: bad access detected
> >
> > Memory state around the buggy address:
> > ffff000804fe0e00: 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b
> > ffff000804fe0f00: 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b 7b
> >> ffff000804fe1000: b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2
> > ^
> > ffff000804fe1100: b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2
> > ffff000804fe1200: b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2 b2
> >
> > This occurs as contiguous pages may have different KASAN tags in the upper address
> > bits, leading to a tag mismatch if linear addressing is used.
> >
> > Fix this by treating them as discontiguous.
> >
> > Signed-off-by: Daniel J Blueman <daniel@xxxxxxxxx>
> > Fixes: 397239ed6a6c ("btrfs: allow extent buffer helpers to skip cross-page handling")
> >
> > ---
> > fs/btrfs/extent_io.c | 12 ++++++++++--
> > 1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > index 5f97a3d2a8d7..e2b241fb6c0e 100644
> > --- a/fs/btrfs/extent_io.c
> > +++ b/fs/btrfs/extent_io.c
> > @@ -3517,8 +3517,16 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
> > * At this stage, either we allocated a large folio, thus @i
> > * would only be 0, or we fall back to per-page allocation.
> > */
> > - if (i && folio_page(eb->folios[i - 1], 0) + 1 != folio_page(folio, 0))
> > - page_contig = false;
> > + if (i > 0) {
> > + struct page *prev = folio_page(eb->folios[i - 1], 0);
> > + struct page *curr = folio_page(folio, 0);
> > +
> > + /*
> > + * Contiguous pages may have different tags; can't be treated as contiguous
> > + */
> > + if (curr != prev + 1 || page_kasan_tag(curr) != page_kasan_tag(prev))
> > + page_contig = false;
>
> I am not a fan of this solution.
>
> Although it doesn't affect end users who don't have KASAN software
> tagging enabled, I don't see what we really gain from the different tags.
>
> I mean, all those pages are already contiguous in physical addresses, so
> why can't we access the range in one go?
>
> Maybe it would be better to set all pages to the same random tag if
> page_contig is true?

I don't know if there's an interface to change the tags, but adding
one condition that enables a sanitizer to work on some platform does not
sound like a terrible thing. The contiguous pages on our side are an
optimization, so it's a special case. I'd rather adapt to the sanitizers
than let people ignore a warning or have to read a note saying that this
one is harmless.
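
For illustration only, here is a rough userspace sketch of the mismatch
from the report. It is not kernel code. It assumes one tag per page
(software tag-based KASAN really tracks tags per small granule), and every
name in it (model_page, tagged_addr, linear_access_ok, pages_contig) is
hypothetical. Only the last helper mirrors the condition added by the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_TAG_SHIFT 56 /* the tag lives in the top byte of the address */

/* Hypothetical model of a page: its frame number and its memory tag. */
struct model_page {
	uint64_t pfn;
	uint8_t tag;
};

/* Place a tag in the top byte of an address, replacing whatever was there. */
static uint64_t tagged_addr(uint64_t addr, uint8_t tag)
{
	return (addr & ~(0xffULL << MODEL_TAG_SHIFT)) |
	       ((uint64_t)tag << MODEL_TAG_SHIFT);
}

/* Extract the pointer tag from the top byte of an address. */
static uint8_t addr_tag(uint64_t addr)
{
	return (uint8_t)(addr >> MODEL_TAG_SHIFT);
}

/*
 * Model a linear access of @len bytes that starts at pages[0] and keeps
 * using the pointer tag of pages[0]. It fails as soon as it reaches a
 * page whose memory tag differs, or runs past the array.
 */
static bool linear_access_ok(const struct model_page *pages, size_t nr,
			     size_t len)
{
	uint8_t ptr_tag = pages[0].tag;

	for (size_t off = 0; off < len; off += MODEL_PAGE_SIZE) {
		size_t idx = off / MODEL_PAGE_SIZE;

		if (idx >= nr || pages[idx].tag != ptr_tag)
			return false;
	}
	return true;
}

/* Same shape as the patched check: adjacent frames and matching tags. */
static bool pages_contig(const struct model_page *prev,
			 const struct model_page *curr)
{
	return curr->pfn == prev->pfn + 1 && curr->tag == prev->tag;
}
```

With the tags visible in the memory dump (0x7b on the page before
pfn 0x884fe1, 0xb2 on that page itself), linear_access_ok() rejects an
8 KiB access and pages_contig() returns false, so the buffer would be
handled page by page. With equal tags both return true.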