Re: scary ext2 filesystem question

tytso@mit.edu
Thu, 7 Jan 1999 14:36:39 -0500


Date: Tue, 05 Jan 1999 20:53:23 PST
From: "Dominic Giampaolo" <dbg@be.com>

Unfortunately whether you write metadata first or not is irrelevant in
the example I gave in the book.

So it sounds like you admit your book *does* have some real problems in
terms of explaining this issue.....

---------------------------------------------------------------------------

The original statement I made was that unexpected results can occur
if there is a crash. This is due to the way in which Linux ext2
handles file system meta data. That statement was based on two
assumptions that I believe to be correct:

- The Linux ext2 file system caches everything including file
  system meta-data.

- The Linux block cache does not implement any soft-update
  mechanism and flushes blocks as it sees fit (presumably
  sorted by block address).

If those two conditions are true then it is possible (although
unlikely) that an application could create two files, fileA and
fileB, and even though fileB is created after fileA, only fileB
will exist after a reboot. This can happen if the cache flushes
the meta data blocks related to fileB before the meta data blocks
for fileA. If a power failure occurs after the meta data for fileB
is flushed but before the meta data for fileA is, then after a reboot
fsck would properly clean up the file system and fileB would
exist, but not fileA. In some circumstances this is not acceptable.
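
As a rough illustration of the ordering problem described above, here is a toy Python model. It is not ext2 or buffer-cache code: the block addresses (900 and 120), the function name `flush_until_crash`, and the idea that each file's metadata sits in exactly one block are all assumptions made up for the sketch. It only shows that flushing in block-address order can put fileB's metadata on disk before fileA's.

```python
# Toy model only: block numbers and names are hypothetical, and this is
# not how ext2 or the Linux block cache is actually implemented.

def flush_until_crash(dirty_blocks, blocks_written_before_crash):
    """Flush dirty blocks in ascending block-address order, stopping early.

    dirty_blocks maps a block address to the file whose metadata it holds.
    Returns the set of files whose metadata reached the disk.
    """
    on_disk = set()
    for address in sorted(dirty_blocks)[:blocks_written_before_crash]:
        on_disk.add(dirty_blocks[address])
    return on_disk


# fileA is created first, but its metadata happens to live at a higher
# block address than fileB's (an assumption for this example).
dirty = {900: "fileA", 120: "fileB"}

# Power fails after a single block has been written.
survivors = flush_until_crash(dirty, blocks_written_before_crash=1)
print(survivors)  # {'fileB'}
```

With a different block layout the outcome flips, which is why the scenario is possible but not certain.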

In this very special case --- where the existence of fileA versus the
existence of fileB matters, but the _contents_ of fileA and fileB don't
matter --- sure, maybe. (You've admitted that with both the Linux scheme
and the BSD scheme the contents of the files are completely up for grabs
unless the application uses fsync.) But I'm not at all convinced that
your scenario comes up all that often. Most of the time, applications
want the filesystem to reliably store the contents of the file, and not
just reliably maintain the fact of whether or not the filename exists in
the directory. After all, the file's existence is a single bit of
information, and the file's data might comprise several megabytes. So
is the one bit of data indicating the file's existence (with respect to
another file's existence, no less) more important than the fact that
several megabytes of data are at risk? Yeah, right.
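
For what it's worth, an application that does care about its data can ask for it explicitly. The following is a minimal sketch in Python, assuming a POSIX system; the helper name `write_durably` is hypothetical, and the extra fsync on the parent directory is an assumption on my part (a common POSIX idiom for making the directory entry durable as well), not something discussed in this thread.

```python
import os


def write_durably(directory, name, data):
    """Write data to directory/name, then fsync the file and its directory.

    Hypothetical helper for illustration; assumes a POSIX system where a
    directory can be opened read-only and passed to fsync.
    """
    path = os.path.join(directory, name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        while offset < len(data):
            # os.write may write fewer bytes than requested, so loop.
            offset += os.write(fd, data[offset:])
        os.fsync(fd)  # ask the kernel to push the file's data to disk
    finally:
        os.close(fd)

    # Assumption: also sync the directory so the new name itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return path
```

An application that needs fileA to be on disk before fileB exists could call `write_durably(".", "fileA", b"...")` and only then create fileB, rather than relying on the order in which the cache happens to flush.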

So you admit that your book gave an incorrect example which allegedly
showed how ext2fs's update was less reliable, and then when pushed, you
come up with this contrived example where the file's existence with
respect to another file's existence might be more important than the
actual file's data --- none of which was actually in your book.

I would therefore claim that your book unfairly spread FUD about the
reliability of ext2, and the only example you could give is so
contrived that it isn't an issue in real life. This has caused folks
to quite fairly (IMO) question your motives in writing what you did in
the book. Perhaps you weren't trying to trash the ext2 filesystem,
but it seems fairly clear that your arguments weren't well
thought out before you penned that section of the book, and that
doesn't speak well for the quality of your book.

- Ted

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/