Re: [PATCH] btrfs: scrub: skip PREALLOC extents on RAID stripe-tree

From: Qu Wenruo
Date: Fri Sep 13 2024 - 01:47:47 EST




On 2024/9/13 15:12, Johannes Thumshirn wrote:
On 12.09.24 23:32, Qu Wenruo wrote:


On 2024/9/13 00:03, Johannes Thumshirn wrote:
From: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>

When scrubbing a RAID stripe-tree based filesystem with preallocated
extents, the btrfs_map_block() called by
scrub_submit_extent_sector_read() will return ENOENT, because there is
no RAID stripe-tree entry for preallocated extents. This then causes
the sector to be marked as a sector with an I/O error.

To prevent this false alert, don't mark sectors for which
btrfs_map_block() returned ENOENT as I/O errors, but skip them.

The false alert shows up, for example, as errors in fstests' btrfs/060 ..
btrfs/074, which all perform fsstress and scrub operations. With this fix,
these errors are gone and the tests pass again.

Cc: Qu Wenruo <wqu@xxxxxxxx>

My concern is that ENOENT can indicate a real problem other than PREALLOC.
I'd prefer this to be the last-resort method.

Hm but what else could create an entry in the extent tree without having
it in the stripe tree? I can't really think of a situation creating this
layout.

My concern is that, if some other bug causes certain writes not to
create the needed RST entry, we will always treat them as preallocated
during scrub.

Thus it may be better to have a way to distinguish a real missing entry
and preallocated extents.
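
To make the difference concrete, here is a minimal userspace sketch in C.
It is not kernel code, and every name in it (model_map_block(),
scrub_classify_lenient(), scrub_classify_strict(), struct model_extent) is
hypothetical. It also assumes scrub already knows whether an extent is
preallocated, modelled as a plain boolean, which is exactly the part that
is still open in this discussion.

```c
/*
 * Userspace model only, NOT kernel code. All names are hypothetical and
 * exist purely to compare the two policies discussed in this thread.
 */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum sector_verdict {
	SECTOR_READ,     /* mapping found, the read can be submitted */
	SECTOR_SKIP,     /* preallocated, nothing on disk to verify */
	SECTOR_IO_ERROR, /* mapping missing or failed, report it */
};

struct model_extent {
	bool prealloc;      /* assumption: scrub can tell this somehow */
	bool has_rst_entry; /* a stripe-tree entry exists for the extent */
};

/* Stand-in for the mapping lookup: -ENOENT when no RST entry exists. */
static int model_map_block(const struct model_extent *e)
{
	assert(e != NULL);
	return e->has_rst_entry ? 0 : -ENOENT;
}

/* Policy from the patch description: every -ENOENT is skipped. */
static enum sector_verdict scrub_classify_lenient(const struct model_extent *e)
{
	int ret = model_map_block(e);

	if (ret == -ENOENT)
		return SECTOR_SKIP;
	return ret ? SECTOR_IO_ERROR : SECTOR_READ;
}

/* Stricter policy: -ENOENT is skipped only for known preallocated extents. */
static enum sector_verdict scrub_classify_strict(const struct model_extent *e)
{
	int ret = model_map_block(e);

	if (ret == -ENOENT && e->prealloc)
		return SECTOR_SKIP;
	return ret ? SECTOR_IO_ERROR : SECTOR_READ;
}
```

In this model, a written extent whose RST entry went missing because of a
bug is silently skipped by the lenient policy, while the strict policy still
reports it as an error. Both policies skip a genuinely preallocated extent.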



Would it be possible to create an RST entry for preallocated extents
manually? E.g. without creating a dummy OE, just insert the needed
RST entries into the RST tree at fallocate time?

Let me give it a try. But I'm a bit less happy to do so, as RST already
increases the write amplification.

Well, write amplification is always a big problem for btrfs...

Thanks,
Qu