XFS+dm-cypt+raid5 - big data loss

From: Tillmann Steinbrecher
Date: Tue Jan 25 2005 - 07:33:34 EST


Hi,

I use a combination of software RAID5, dm-crypt, and XFS as a
filesystem. The filesystem is exported via nfs and samba (both
read-only). Yesterday I lost a large amount of data (420GB). It is
unclear to me what actually caused the problem; there was no crash or
anything. The system is using reliable hardware that never caused
problems in the past.

When I came home from work in the evening, I couldn't access my RAID5
partition - i/o errors. Looking in /var/log/messages, I found out that
at 15:11, XFS suddenly started acting up, with no apparent reason - no
crash, power failure, read error from HD, nothing!

Jan 24 14:16:55 localhost -- MARK --
Jan 24 14:36:55 localhost -- MARK --
Jan 24 14:56:55 localhost -- MARK --
Jan 24 15:06:45 localhost kernel: smb_proc_readdir_long: error=-2,
breaking
Jan 24 15:07:19 localhost kernel: smb_proc_readdir_long: error=-5,
breaking
Jan 24 15:08:38 localhost kernel: smb_proc_readdir_long: error=-2,
breaking
Jan 24 15:11:29 localhost kernel: [xfs_da_do_buf+905/2160]
xfs_da_do_buf+0x389/0x870
Jan 24 15:11:29 localhost kernel: [xfs_da_read_buf+88/96]
xfs_da_read_buf+0x58/0x60
Jan 24 15:11:29 localhost kernel: [xfs_da_read_buf+88/96]
xfs_da_read_buf+0x58/0x60
Jan 24 15:11:29 localhost kernel: [pagebuf_iostart+120/192]
pagebuf_iostart+0x78/0xc0
Jan 24 15:11:29 localhost kernel: [kfree_skbmem+36/48] kfree_skbmem
+0x24/0x30
Jan 24 15:11:29 localhost kernel: [xfs_da_read_buf+88/96]
xfs_da_read_buf+0x58/0x60
Jan 24 15:11:29 localhost kernel: [xfs_dir2_block_lookup_int+82/416]
xfs_dir2_block_lookup_int+0x52/0x1a0
Jan 24 15:11:29 localhost kernel: [xfs_dir2_block_lookup_int+82/416]
xfs_dir2_block_lookup_int+0x52/0x1a0
Jan 24 15:11:29 localhost kernel: [sk_reset_timer+31/48] sk_reset_timer
+0x1f/0x30
Jan 24 15:11:29 localhost kernel: [xfs_bmap_last_offset+197/304]
xfs_bmap_last_offset+0xc5/0x130
Jan 24 15:11:29 localhost kernel: [xfs_dir2_block_lookup+47/208]
xfs_dir2_block_lookup+0x2f/0xd0
Jan 24 15:11:29 localhost kernel: [xfs_dir2_lookup+183/352]
xfs_dir2_lookup+0xb7/0x160
Jan 24 15:11:29 localhost kernel: [copy_to_user+62/80] copy_to_user
+0x3e/0x50

etc etc - there's countless pages like that in the logs.

I rebooted; and I could mount the partition again (XFS did recovery on
mount). The data appeared to be intact (at least what I looked at).
However, I'd still see countless XFS error messages output to the
console. So I unmounted, and ran xfs_check. Output from xfs_check was
over 200MB, including the following messages:

bad magic number 0x2b3c for inode 68753536
bad agi magic # 0x9512d08b in ag 3
bad agi version # 0x2edcb5f7 in ag 3
can't seek in filesystem at bb 16183039256
can't read btree block 3/2000906443
agi_count 1975555954, counted 0 in ag 3
agi_freecount 2021644599, counted 0 in ag 3
agi unlinked bucket 0 is 1009384619 in ag 3 (inode=1009384619)
bad directory data magic # 0x4ffc0558 for dir ino 1026624136 block 0
no . entry for directory 1026624136
no .. entry for directory 1026624136

I also ran xfs_check with -s option; output is small, see end of this
mail.

Ok, so I ran xfs_repair to fix this... BAD IDEA. xfs_repair also
complained about many problems, and finally did this:

empty data block 4180 in directory inode 1663015222: junking block
bad hash table for directory inode 1663015222 (no leaf entry):
rebuilding
rebuilding directory inode 1663015222

At this point, it started using more and more memory, until main memory
(768MB) and swap (2GB) were exhausted, and xfs_check was killed by the
out-of-memory-killer, leaving behind a damaged filesystem with illegal
directory names - 420GB lost, thank you.

I'm not sure what caused the original problem, it may be xfs, dm-crypt,
or raid; however the behavior of xfs_check is quite a desaster. I also
knew that xfs' behavior was problematic in case of power failure during
operation, but this was not what happened here. The filesystem just
corrupted itself without a reason.

Of course I have a complete backup on DVD-R/CD, but it will take days to
restore. I've now switched to reiser4 on this large data partition - at
least they ADMIT it's experimental (sorry for sounding a bit trollish in
the last sentence, but I'm VERY VERY annoyed).

Please email replies to me, I'm not subscribed. If anybody would like to
see more logs to investigate, please email me.

bye,
Till

Output of xfs_check -s :
can't seek in filesystem at bb 16183039256
bad magic number 0x880a for inode 750785168
bad magic number 0x9452 for inode 750785169
bad magic number 0xdb92 for inode 750785170
bad magic number 0x2749 for inode 750785171
bad magic number 0x5200 for inode 750785172
bad magic number 0x37fa for inode 750785173
bad magic number 0x10 for inode 750785174
bad magic number 0xd801 for inode 750785175
bad magic number 0x8aec for inode 750785176
bad magic number 0xd047 for inode 750785177
bad magic number 0x7142 for inode 750785178
bad magic number 0x76a for inode 750785179
bad magic number 0xcab8 for inode 750785180
bad magic number 0x4c91 for inode 750785181
bad magic number 0x441a for inode 750785182
bad magic number 0xb553 for inode 750785183
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/