Re: [PATCH] btrfs: zoned: protect sb_write_pointer() reads with invalidate lock

From: Filipe Manana

Date: Thu Jun 11 2026 - 10:01:47 EST


On Thu, Jun 11, 2026 at 2:33 PM Runyu Xiao <runyu.xiao@xxxxxxxxxx> wrote:
>
> When both zoned superblock log zones are full, sb_write_pointer() reads
> the last superblock page from each zone with read_cache_page_gfp() to
> compare generations. Those reads go through bdev->bd_mapping without
> filemap_invalidate_lock(), even though the same zoned discovery flow
> later reaches btrfs_read_disk_super(), whose final read already takes
> filemap_invalidate_lock(mapping).
>
> A running system can reach this while mounting or scanning a zoned
> filesystem whose superblock log has both zones full. In that state,
> sb_write_pointer() performs two unprotected page-cache reads before
> btrfs_read_disk_super() does its later protected final read.
>
> This leaves the early discovery reads outside the same synchronization
> domain used by set_blocksize() when it changes the block-device mapping
> geometry. As a result, read_cache_page_gfp() can race a concurrent
> block-size/layout update on the same mapping and see inconsistent
> geometry across folio allocation and mapping state.
>
> This issue was found by our static analysis tool while scanning
> read_cache_page_gfp(bdev->bd_mapping, ...) sites for missing
> filemap_invalidate_lock() coverage, and then manually audited on Linux
> v6.18.21. The same synchronization requirement is already enforced for
> the final read in btrfs_read_disk_super().
>
> A focused QEMU KCSAN test then raced the zoned superblock discovery
> path against a set_blocksize-style mapping update on the same
> bdev->bd_mapping. It reported a race between
> blkbszset_update_mapping() and read_cache_page_gfp(), with the read
> side reaching:
>
>   sb_write_pointer()
>   sb_log_location()
>   btrfs_sb_log_location_bdev()
>   btrfs_read_disk_super()
>
> Add filemap_invalidate_lock()/unlock() around the two
> read_cache_page_gfp() calls in sb_write_pointer() so the zoned
> superblock discovery path uses the same invalidate-lock contract as the
> final read in btrfs_read_disk_super().
>
> Fixes: 12659251ca5d ("btrfs: implement log-structured superblock for ZONED mode")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Runyu Xiao <runyu.xiao@xxxxxxxxxx>

There's already a patch for this, and it's in linux-next:

https://lore.kernel.org/linux-btrfs/20260521122945.524890-1-lkangn.kernel@xxxxxxxxx/


> ---
> fs/btrfs/zoned.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index e14a4234954b..edc797a43fb5 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -130,8 +130,10 @@ static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
> 			u64 bytenr = ALIGN_DOWN(zone_end, BTRFS_SUPER_INFO_SIZE) -
> 						BTRFS_SUPER_INFO_SIZE;
>
> +			filemap_invalidate_lock(mapping);
> 			page[i] = read_cache_page_gfp(mapping,
> 					bytenr >> PAGE_SHIFT, GFP_NOFS);
> +			filemap_invalidate_unlock(mapping);
> 			if (IS_ERR(page[i])) {
> 				if (i == 1)
> 					btrfs_release_disk_super(super[0]);
> --
> 2.34.1
>