Re: File system corruption after updating dir metadata, moving entry inside to another dir and removing the first dir if system crashes

From: Filipe Manana

Date: Thu Apr 09 2026 - 11:45:43 EST


On Thu, Apr 9, 2026 at 10:22 AM Slava0135
<slava.kovalevskiy.2014@xxxxxxxxx> wrote:
>
> Detailed description
> ====================
>
> Hello, there seems to be an issue with btrfs crash behavior:
>
> 1. Create and sync directories `dir1`, `dir1/dir2` and `dir3`.
> 2. Open `dir1` and update directory metadata using the descriptor (e.g.
> futimens or fchmod).
> 3. Sync `dir1` with fsync.
> 4. Rename `dir1/dir2` to `dir3/dir2`.
> 5. Remove `dir1`.
> 6. Sync `dir1` again with fsync, using the descriptor opened in step 2.
>
> After a system crash (e.g. power failure), mounting the file system fails
> with a critical error caused by a corrupt leaf (`invalid nlink: has 2
> expect no more than 1 for dir`).

Fixed here:

https://lore.kernel.org/linux-btrfs/3f193ac4b0faadc24d718c3633f8c1a9c61a687c.1775747553.git.fdmanana@xxxxxxxx/

Thanks.

>
>
> System info
> ===========
>
> Linux version 7.0.0-rc7
>
>
> How to reproduce
> ================
>
> Test:
>
> ```
> #include <errno.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <string.h>
> #include <sys/stat.h>
> #include <sys/types.h>
> #include <unistd.h>
>
> int main(void) {
>     int status;
>     int dir_fd;
>
>     /* Create dir1, dir1/dir2 and dir3, then persist them. */
>     status = mkdir("dir1", S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH);
>     printf("MKDIR: %d\n", status);
>
>     status = mkdir("dir1/dir2", S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH);
>     printf("MKDIR: %d\n", status);
>
>     status = mkdir("dir3", S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH);
>     printf("MKDIR: %d\n", status);
>
>     sync();
>
>     /* Open dir1 and keep the descriptor for both fsync calls. */
>     status = open("dir1", O_RDONLY | O_DIRECTORY);
>     printf("OPEN: %d\n", status);
>     dir_fd = status;
>
>     /* Update dir1 timestamps (NULL means "now") and fsync the directory. */
>     status = futimens(dir_fd, NULL);
>     printf("FUTIMENS: %d\n", status);
>
>     status = fsync(dir_fd);
>     printf("FSYNC: %d\n", status);
>
>     /* Move dir2 out of dir1, then remove the now-empty dir1. */
>     status = rename("dir1/dir2", "dir3/dir2");
>     printf("RENAME: %d\n", status);
>
>     status = rmdir("dir1");
>     printf("RMDIR: %d\n", status);
>
>     /* fsync the removed directory through the still-open descriptor. */
>     status = fsync(dir_fd);
>     printf("FSYNC: %d\n", status);
>
>     return 0;
> }
> ```
>
> mount:
>
> ```
> root@ubuntu:~# mount -t btrfs /dev/vdb /mnt/fstest
> mount: /mnt/fstest: can't read superblock on /dev/vdb.
> dmesg(1) may have more information after failed mount system call.
> ```
>
> dmesg:
>
> ```
> [ 23.230911] BTRFS: device fsid e045376a-9e24-44ff-b677-1a89d9288309
> devid 1 transid 9 /dev/vdb (253:16) scanned by mount (1091)
> [ 23.231199] BTRFS info (device vdb): first mount of filesystem
> e045376a-9e24-44ff-b677-1a89d9288309
> [ 23.231204] BTRFS info (device vdb): using crc32c checksum algorithm
> [ 23.232581] BTRFS info (device vdb): start tree-log replay
> [ 23.233057] page: refcount:2 mapcount:0 mapping:000000006e29dee3
> index:0x1d00 pfn:0x11bc9b
> [ 23.233062] memcg:ffff88810039c380
> [ 23.233065] aops:btree_aops ino:1
> [ 23.233071] flags:
> 0x17ffffc800402a(uptodate|lru|private|writeback|node=0|zone=2|lastcpupid=0x1fffff)
> [ 23.233076] raw: 0017ffffc800402a ffffea00045dc6c8 ffffea00046c68c8
> ffff888100499a98
> [ 23.233078] raw: 0000000000001d00 ffff888121483b40 00000002ffffffff
> ffff88810039c380
> [ 23.233079] page dumped because: eb page dump
> [ 23.233081] BTRFS critical (device vdb): corrupt leaf: root=5
> block=30408704 slot=10 ino=258, invalid nlink: has 2 expect no more than
> 1 for dir
> [ 23.234545] BTRFS info (device vdb): leaf 30408704 gen 10 total ptrs
> 17 free space 14878 owner 5
> [ 23.234548] item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160
> [ 23.234549] inode generation 3 transid 9 size 16 nbytes 16384
> [ 23.234550] block group 0 mode 40755 links 1 uid 0 gid 0
> [ 23.234552] rdev 0 sequence 2 flags 0x0
> [ 23.234552] atime 1775723505.0
> [ 23.234554] ctime 1775723523.943656123
> [ 23.234554] mtime 1775723523.943656123
> [ 23.234555] otime 1775723505.0
> [ 23.234556] item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12
> [ 23.234557] index 0 name_len 2
> [ 23.234558] item 2 key (256 DIR_ITEM 1843588421) itemoff 16077
> itemsize 34
> [ 23.234559] location key (259 1 0) type 2
> [ 23.234560] transid 9 data_len 0 name_len 4
> [ 23.234561] item 3 key (256 DIR_ITEM 2363071922) itemoff 16043
> itemsize 34
> [ 23.234562] location key (257 1 0) type 2
> [ 23.234563] transid 9 data_len 0 name_len 4
> [ 23.234564] item 4 key (256 DIR_INDEX 2) itemoff 16009 itemsize 34
> [ 23.234565] location key (257 1 0) type 2
> [ 23.234565] transid 9 data_len 0 name_len 4
> [ 23.234566] item 5 key (256 DIR_INDEX 3) itemoff 15975 itemsize 34
> [ 23.234567] location key (259 1 0) type 2
> [ 23.234568] transid 9 data_len 0 name_len 4
> [ 23.234568] item 6 key (257 INODE_ITEM 0) itemoff 15815 itemsize 160
> [ 23.234569] inode generation 9 transid 9 size 8 nbytes 0
> [ 23.234570] block group 0 mode 40755 links 1 uid 0 gid 0
> [ 23.234571] rdev 0 sequence 1 flags 0x0
> [ 23.234572] atime 1775723523.943656123
> [ 23.234573] ctime 1775723523.943656123
> [ 23.234573] mtime 1775723523.943656123
> [ 23.234574] otime 1775723523.943656123
> [ 23.234575] item 7 key (257 INODE_REF 256) itemoff 15801 itemsize 14
> [ 23.234576] index 2 name_len 4
> [ 23.234577] item 8 key (257 DIR_ITEM 2676584006) itemoff 15767
> itemsize 34
> [ 23.234577] location key (258 1 0) type 2
> [ 23.234578] transid 9 data_len 0 name_len 4
> [ 23.234579] item 9 key (257 DIR_INDEX 2) itemoff 15733 itemsize 34
> [ 23.234580] location key (258 1 0) type 2
> [ 23.234581] transid 9 data_len 0 name_len 4
> [ 23.234581] item 10 key (258 INODE_ITEM 0) itemoff 15573 itemsize 160
> [ 23.234582] inode generation 9 transid 10 size 0 nbytes 0
> [ 23.234583] block group 0 mode 40755 links 2 uid 0 gid 0
> [ 23.234584] rdev 0 sequence 0 flags 0x0
> [ 23.234585] atime 1775723523.943656123
> [ 23.234586] ctime 1775723523.943656123
> [ 23.234586] mtime 1775723523.943656123
> [ 23.234587] otime 1775723523.943656123
> [ 23.234588] item 11 key (258 INODE_REF 257) itemoff 15559 itemsize 14
> [ 23.234589] index 2 name_len 4
> [ 23.234589] item 12 key (258 INODE_REF 259) itemoff 15545 itemsize 14
> [ 23.234590] index 2 name_len 4
> [ 23.234591] item 13 key (259 INODE_ITEM 0) itemoff 15385 itemsize 160
> [ 23.234592] inode generation 9 transid 10 size 8 nbytes 0
> [ 23.234593] block group 0 mode 40755 links 1 uid 0 gid 0
> [ 23.234593] rdev 0 sequence 1 flags 0x0
> [ 23.234594] atime 1775723523.943656123
> [ 23.234595] ctime 1775723523.943656123
> [ 23.234596] mtime 1775723523.943656123
> [ 23.234596] otime 1775723523.943656123
> [ 23.234597] item 14 key (259 INODE_REF 256) itemoff 15371 itemsize 14
> [ 23.234598] index 3 name_len 4
> [ 23.234599] item 15 key (259 DIR_ITEM 2676584006) itemoff 15337
> itemsize 34
> [ 23.234600] location key (258 1 0) type 2
> [ 23.234600] transid 10 data_len 0 name_len 4
> [ 23.234601] item 16 key (259 DIR_INDEX 2) itemoff 15303 itemsize 34
> [ 23.234602] location key (258 1 0) type 2
> [ 23.234603] transid 10 data_len 0 name_len 4
> [ 23.234604] BTRFS error (device vdb): block=30408704 write time tree
> block corruption detected
> [ 23.235937] BTRFS error (device vdb): error while writing out
> transaction: -5
> [ 23.236989] BTRFS warning (device vdb): Skipping commit of aborted
> transaction.
> [ 23.236992] BTRFS error (device vdb state A): Transaction aborted
> (error -5)
> [ 23.238020] BTRFS: error (device vdb state A) in
> cleanup_transaction:2061: errno=-5 IO failure
> [ 23.238744] BTRFS error (device vdb state EA): failed to recover log
> trees with error: -5
> [ 23.239614] BTRFS error (device vdb state EA): open_ctree failed: -5
> ```
>
> Steps:
>
> 1. Create and mount a new btrfs file system in the default configuration.
> 2. Change directory to the root of the file system and run the compiled test.
> 3. Cause a hard system crash (e.g. QEMU `system_reset` command).
> 4. Remount the file system after the crash.
> 5. Observe that the mount fails.
>