Re: ext3 on raid5 failure

From: Bas Mevissen
Date: Wed Feb 04 2004 - 04:29:53 EST


Theodore Ts'o wrote:


sfhq:/mnt/data/1/lost+found# ls -l
total 76
d-wSr----- 2 1212680233 136929556 49152 Jun 7 2008 #16370
-rwx-wx--- 1 1628702729 135220664 45056 May 4 1974 #16380


Ok, this looks like random garbage has gotten written into inode table.

If you can make this happen consistently with 2.6 and not with 2.4,
then that would be useful to know. There may be some kind of race
condition or problem with either the raid5 code, or the combination of
raid5 plus ext3. It's unlikely this kind of error would be caused by
a flaw in the ext3 code alone, since this is indicative of complete
garbage written to the inode table, or a block intended for another
location on disk getting written to the inode table. The natural
suspect is at the block device layer and below.


I've seen this kind of problems on my notebook too. Among others, over 600MB of a huge cache directory (from a news reader) was having "funny" permissions. Maybe more files were affected. I used fsck.ext3 and changed the attributes with chmod.

It may be caused by crashes of the notebook (power failure and suspend/resume failure), but I would expect that the administration of the fs would survive that, as it did for years. Actually, the last time I had filesystem corruption was in kernel 1.2.xx days...

As my notebook is not using raid, I suspect something in the ext3 code. Kernel is 2.6.1 with ACPI, acpi-dsdt and swsusp2 patches.

Regards,

Bas.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/