Re: ext3 corruption
From: Molle Bestefich
Date: Sat Aug 12 2006 - 13:21:41 EST
Theodore Tso wrote:
> I'm especially worried about the "70058 files, 39754 blocks used (0%)"
> message at the end of the e2fsck run.
OK, so it looks like the primary block group descriptors got trashed,
and so e2fsck had to fall back to using the backup block group
Good to have backups. It would be very useful to know whether e2fsck
contemplates writing those back as primary BGDs when it's done, but I
couldn't find that in the documentation. Will it?
(Would be good to have the above information in the docs. Perhaps in
a "what does this message mean?" section.)
(Such a section would also help a lot when confronted with the first
message: "Entry blah is a link to directory bluh. Clear? y/n".
Obviously I don't want to "clear" my data. But why is e2fsck
confronting me with that question? Is something wrong with it that it
should be cleared?)
The summary information in the backup block group
descriptors is not backed up, for speed/performance reasons. This is
not a problem, since that information can always be regenerated
trivially from the pass 5 information.
Thanks for the information!
(Would be very helpful to have a copy/paste of the above in the docs too...)
That's what all of "free inodes/blocks/directories count wrong"
messages in your log were all about.
Ah, I'm relieved. The sheer number of messages was an indicator to me
that e2fsck is doing something gruesomely wrong.
The 39754 block used (0%) is just because you were using -n and the
summary information is calculated from the filesystem summary data,
not from the pass5 count information (which was thrown away since you
were running -n and thus declined to fix the results).
Much relieved again, thanks.
I'm wondering why it even tries to use the corrupt information, instead of just:
* reconstructing it from scratch
* not asking the user?
That leaves me a little less relieved once again ;-).
I can imagine accepting a patch which sets a flag if any discrepancies
found in pass 5 are not fixed, and then if the summary information is
Huh? The user didn't request anything, it always prints.
to print a warning message indicating that the summary
information may not be correct.
Even not printing anything would probably be better than knowingly
printing wrong information...
But no, it's not worth it to take
into account changes that -n might make if the user had said "yes".
The complexities that would entail would be huge, and in fact as it is
e2fsck -n does give a fairly accurate report of what what is wrong
with the filesystem. Is it 100% accurate? No, but that was never the
goal of e2fsck -n. If you want that, then use a dm-snapshot, and run
e2fsck on the snapshot....
Agreed. Running a r/w e2fsck on some kind of overlay would be the way
to implement a more useful (for me anyway) version of -n.
But I think dm-snapshot is useless in this case because:
* It must be configured before MD is configured and the filesystem is
created, which I haven't done on this box.
And generally because:
* It's rather good at corrupting the filesystems you store on top of
* Either you have to create snapshots on devices just as big as the
ones being snapshotted, or you'll have to live with the snapshot
failing any time because it's full. There's no good management
framework to help you manage the full/failing situations either.
Thanks a lot for the information!
I take it that it's safe to run e2fsck on the filesystem, then...
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/