Re: [BUG] btrfs: dev-replace finishing commit error reaches WARN_ON and panic_on_warn

From: Yifei Chu

Date: Mon May 25 2026 - 22:45:45 EST


Hi Qu,

You are right to call this out, and I am sorry about that. If the archive you received contained only a hidden ._... AppleDouble file, then the attachment was clearly not useful and looked suspicious.

I checked the file I had locally before sending. On my side, the original tarball does expand to the files I described in the report: README.md, repro_init.c, positive_instrumentation.diff, and the two serial logs plus the QEMU args/result files. So I suspect I tripped over a macOS AppleDouble/attachment handling issue somewhere while preparing or sending it, and you ended up seeing the metadata sidecar instead of the real contents.

For reference, the local original tarball I checked is:

btrfs_dev_replace_finishing_commit_warn_panic_20260523.tar.gz
size: 29856 bytes
sha256: 61948cdd4a3ec0ca08b0e25fcec99b3d128558f6d986519029fc36a169d7e1f1

tar -tzf on that local file shows:

btrfs_dev_replace_finishing_commit_warn_panic_20260523/
btrfs_dev_replace_finishing_commit_warn_panic_20260523/README.md
btrfs_dev_replace_finishing_commit_warn_panic_20260523/repro_init.c
btrfs_dev_replace_finishing_commit_warn_panic_20260523/positive_instrumentation.diff
btrfs_dev_replace_finishing_commit_warn_panic_20260523/logs/
btrfs_dev_replace_finishing_commit_warn_panic_20260523/logs/positive_run1.qemu-args.txt
btrfs_dev_replace_finishing_commit_warn_panic_20260523/logs/positive_run1.result.txt
btrfs_dev_replace_finishing_commit_warn_panic_20260523/logs/positive_run1.serial.log
btrfs_dev_replace_finishing_commit_warn_panic_20260523/logs/positive_run2.qemu-args.txt
btrfs_dev_replace_finishing_commit_warn_panic_20260523/logs/positive_run2.result.txt
btrfs_dev_replace_finishing_commit_warn_panic_20260523/logs/positive_run2.serial.log

I have attached a freshly repacked archive here. I created it with macOS copyfile metadata disabled and tar xattrs excluded, and checked that there are no ._* or __MACOSX entries in it.

btrfs_dev_replace_finishing_commit_warn_panic_20260523.clean.tar.gz
size: 28744 bytes
sha256: 30ab240e6a090a0013d444ad1a6f7b27e0a33ea5ae38022a34835352dc205b36
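
For anyone who wants to repeat the check on their side, here is a minimal sketch of that kind of verification. It is illustrative only and is not the exact set of commands I used to build the archive. The helper names (sha256_of, macos_sidecars, check_archive) are made up for this example; it relies only on the Python standard library.

```python
import hashlib
import sys
import tarfile
from pathlib import PurePosixPath


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def macos_sidecars(names):
    """Return member names that look like macOS metadata residue.

    A member is flagged if any path component starts with "._"
    (AppleDouble sidecar) or is exactly "__MACOSX".
    """
    flagged = []
    for name in names:
        parts = PurePosixPath(name).parts
        if any(p.startswith("._") or p == "__MACOSX" for p in parts):
            flagged.append(name)
    return flagged


def check_archive(path):
    """Collect the checksum, member list, and flagged members of a tarball."""
    with tarfile.open(path, "r:*") as tar:  # "r:*" auto-detects compression
        names = tar.getnames()
    return {
        "sha256": sha256_of(path),
        "members": names,
        "sidecars": macos_sidecars(names),
    }


if __name__ == "__main__":
    # Usage: python check_archive.py <archive.tar.gz> [...]
    for archive in sys.argv[1:]:
        report = check_archive(archive)
        print(f"{archive}\nsha256: {report['sha256']}")
        for member in report["members"]:
            print(f"  {member}")
        if report["sidecars"]:
            print(f"macOS metadata entries found: {report['sidecars']}")
        else:
            print("no ._* or __MACOSX entries")
```

Running it against the received attachment and comparing the printed sha256 with the value above should show whether the file was altered in transit.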

Again, sorry for the noise and for making you spend time on a broken attachment. If this one still does not come through cleanly, I will stop using an attachment and send the small reproducer/patch inline instead.

Thanks,
Chuyifei

On Mon, 25 May 2026 07:47:52 +0930, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:

On 2026/5/25 00:45, Yifei Chu wrote:

Hello,

Short version: I am reporting a second Btrfs dev-replace error-path bug
found with targeted fault injection, this time in the finishing path.
The injected -EIO is in btrfs_commit_transaction()’s normal error-return
domain, and the injection is placed at the transaction-commit return
boundary. With that rare commit failure made deterministic,
btrfs_dev_replace_finishing() reaches WARN_ON(ret); with panic_on_warn=1
this panics the kernel.

Tested kernel:

v7.1-rc4-640-g79bd2dded182-dirty
commit base 79bd2dded182b1d458b18e62684b7f82ffc682e5
x86_64 QEMU, KASAN config

Reproducer shape:

The initramfs source mounts a single-device Btrfs image on /dev/vda and
starts device replace from source devid 1 to an empty target device
/dev/vdb. The validation patch forces the commit in
btrfs_dev_replace_finishing() to return -EIO.

The point of the injection is not arbitrary state corruption. It makes a
valid transaction-commit error deterministic at this caller, so the
finishing path’s error handling can be tested.

I reproduced this twice under targeted fault injection. The signature is:

BTRFS info (device vda): dev_replace from /dev/vda (devid 1) to /dev/vdb started
BTRFS error (device vda): AGENT_BTRFS_DEV_REPLACE_FINISH: forcing finish commit EIO
WARNING: fs/btrfs/dev-replace.c:912 at btrfs_dev_replace_finishing+0x295/0x13a0
RIP: 0010:btrfs_dev_replace_finishing+0x295/0x13a0
Kernel panic - not syncing: kernel: panic_on_warn set …

This looks related in theme to the dev-replace start-path error
handling, but it is a separate callsite and a different state surface:
the commit happens after scrub returns and before
mapping-tree/device-list updates. I did not find a direct
current-upstream fix for this finishing-phase WARN_ON(ret) site in my
local duplicate sweep.

Expected behavior:

The commit error should be propagated and the replace state should be
left consistent, rather than treating the error as a warning-only
invariant. A real fix likely needs to audit target/source device
lifetime and replace state around this finishing path.

The attached tarball includes README.md, repro_init.c,
positive_instrumentation.diff, QEMU args/results, and both full serial logs.

Your attachment looks very malicious.

Firstly the tar.gz is way larger than the only file inside the tar.gz.

Secondly the only file inside that archive is a hidden file, named
“._btrfs_dev_replace_finishing_commit_warn_panic_20260523”.

That the file is hidden is already suspicious, and “file” tells me it’s an
“AppleDouble encoded Macintosh file”.

Nothing matches your description.

You have a lot of things to explain.

Thanks,
Chuyifei

Attachment: btrfs_dev_replace_finishing_commit_warn_panic_20260523.clean.tar.gz
Description: Unix tar archive