Re: [PATCH 0 of 7] Block/SCSI Data Integrity Support
From: Martin K. Petersen
Date: Thu Jul 17 2008 - 11:36:46 EST
>>>>> "Mike" == Mike Snitzer <snitzer@xxxxxxxxx> writes:
>> I'm testing with XFS and btrfs. Generally doing kernel builds,
>> etc. ext2/3 are still problematic because they modify pages in
Mike> Have you made the ext2/3/4 developers aware of this?
Mike> Shouldn't _any_ filesystem "just work" given that the block
Mike> layer is what is generating the checksums and then verifying
Mike> them on read?
There are a couple of issues. One problem is that pages are no longer
locked down during I/O. Instead the writeback bit is being set to
indicate that I/O is in progress. Not all corners of ext* have been
adapted to that properly. Especially ext2 suffers and often modifies
pages containing metadata while they are in flight. If I remember
correctly, ext2/dir.c hasn't been made aware of writeback at all and
assumes the page lock still works like it used to.
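
To illustrate the distinction above, here is a minimal toy model in Python. It is a sketch under stated assumptions, not kernel code: the `Page` class and both helper functions are hypothetical names invented for this example. It only shows why code that checks the page lock alone can end up touching a page that is still in flight.

```python
class Page:
    """Toy stand-in for a page cache page (hypothetical, not the kernel struct)."""

    def __init__(self, size: int = 4096) -> None:
        self.data = bytearray(size)
        self.locked = False      # models the page lock
        self.writeback = False   # models the writeback bit set while I/O is in progress


def legacy_can_modify(page: Page) -> bool:
    # Old assumption: the page stays locked for the duration of the I/O,
    # so an unlocked page is safe to modify.
    return not page.locked


def writeback_aware_can_modify(page: Page) -> bool:
    # Writeback-aware check: an unlocked page may still be in flight,
    # so the writeback bit has to be consulted as well.
    return not page.locked and not page.writeback


# A page that has been submitted for I/O: unlocked, but under writeback.
in_flight = Page()
in_flight.writeback = True
```

With `in_flight` as set up above, the legacy check allows the modification while the writeback-aware check refuses it. In the kernel a filesystem would normally wait for writeback to finish rather than just test a flag; the boolean check is only there to keep the model small.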
That is normally not a huge problem because the page is being
scheduled for write again shortly thereafter. So the inconsistent
block on disk gets overwritten pretty much instantly. But that kind
of sloppy behavior is a no-go with integrity checking turned on.
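
The failure mode can be sketched in a few lines of Python. This is an illustration only: `zlib.crc32` is a stand-in for whatever checksum the integrity code actually uses, the 512-byte sector size is an assumption, and `submit_write`, `device_verify` and `simulate` are hypothetical names, not block layer functions.

```python
import zlib

SECTOR_SIZE = 512  # assumed sector size, for illustration only


def submit_write(page: bytearray) -> int:
    """Compute the checksum at submission time, before the data leaves memory."""
    return zlib.crc32(bytes(page))


def device_verify(page: bytearray, guard: int) -> bool:
    """Recompute the checksum over the data as it arrives and compare it to the guard."""
    return zlib.crc32(bytes(page)) == guard


def simulate(modify_in_flight: bool) -> bool:
    """Return True if the write passes verification, False if it is rejected."""
    page = bytearray(SECTOR_SIZE)
    guard = submit_write(page)      # checksum generated when the I/O is queued
    if modify_in_flight:
        page[0] ^= 0xFF             # filesystem touches the page while it is in flight
    return device_verify(page, guard)
```

Calling `simulate(False)` models a well-behaved filesystem and the write verifies. Calling `simulate(True)` models the in-flight modification described above: the data no longer matches the checksum computed at submission, so the write is rejected instead of quietly landing on disk and being overwritten later.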
There also appear to be some quirks in the page cache in general.
There's something not quite right in clear_page_dirty() /
page_mkwrite() territory. If I sync excessively I can make any fs
keel over. peterz said that an mmapped page is supposed to be
read-only during writeback but that appears to be racy when a forced
sync is involved.
That's my recollection, anyway. I've been busy with the innards of
the integrity code stuff for a couple of months and haven't poked at
the fs/vm issues for a while.
Martin K. Petersen Oracle Linux Engineering