Re: [RFC] block integrity: Fix write after checksum calculation problem

From: Chris Mason
Date: Fri Mar 11 2011 - 11:35:05 EST


Excerpts from Darrick J. Wong's message of 2011-03-10 18:57:22 -0500:
> On Tue, Mar 08, 2011 at 03:56:26PM +1100, Dave Chinner wrote:
>
> > On Fri, Mar 04, 2011 at 01:07:24PM -0800, Darrick J. Wong wrote:
> > > On Mon, Feb 28, 2011 at 07:54:05AM -0500, Chris Mason wrote:
> > > > Excerpts from Darrick J. Wong's message of 2011-02-24 13:27:32 -0500:
> > > > > On Thu, Feb 24, 2011 at 12:37:53PM -0500, Chris Mason wrote:
> > > > > > Excerpts from Jan Kara's message of 2011-02-24 11:47:58 -0500:
> > > > > > > On Wed 23-02-11 15:35:11, Chris Mason wrote:
> > > > > > > > Excerpts from Joel Becker's message of 2011-02-23 15:24:47 -0500:
> > > > > > > > > On Tue, Feb 22, 2011 at 11:45:44AM -0500, Martin K. Petersen wrote:
> > > > > > > > > > Also, DIX is only the tip of the iceberg. Many other impending
> > > > > > > > > > technologies feature checksums and require pages to be stable during I/O
> > > > > > > > > > due to checksumming, encryption and so on.
> > > > > > > > > >
> > > > > > > > > > The VM is already trying to do the right thing. We just need the
> > > > > > > > > > relevant filesystems to catch up.
> > > > > > > > >
> > > > > > > > > ocfs2 handles stable metadata for its checksums when feeding
> > > > > > > > > things to the journal. If we're doing pagecache-based I/O, is the
> > > > > > > > > pagecache going to help here for data?
> > > > > > > >
> > > > > > > > Data is much easier than metadata. All you really need is to wait on
> > > > > > > > writeback in file_write, wait on writeback in page_mkwrite, and make
>
> Hrm... I've been looking for a file_write in ext4; was the aio_write function
> pointer what you had in mind here?

Your change to grab_cache_page_write_begin looks good to me, at least
for ext4. For ext3 you have to actually go in and wait for each of the
buffer heads in the page, since ext3 (and reiserfs) will write the buffer heads
directly without using writepage.
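
To make the per-buffer wait concrete, here is a minimal userspace model of
the idea, not kernel code. The struct layouts, the model_* helpers and
wait_for_stable_page() are all hypothetical stand-ins; in the kernel the
corresponding primitives would be wait_on_page_writeback() and
wait_on_buffer(), walking the page's buffers through b_this_page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel structures; NOT the real definitions. */
struct buffer_head {
    bool under_io;                   /* models a buffer with I/O in flight */
    struct buffer_head *b_this_page; /* circular list of buffers in one page */
};

struct page {
    bool writeback;                  /* models the page writeback flag */
    struct buffer_head *buffers;     /* NULL when the page has no buffers */
};

/* Hypothetical helper: the kernel would sleep here until the I/O ends.
 * This model simply marks the I/O as finished. */
void model_wait_on_buffer(struct buffer_head *bh)
{
    bh->under_io = false;
}

/* Hypothetical helper: covers writes that go through writepage. */
void model_wait_on_page_writeback(struct page *page)
{
    page->writeback = false;
}

/* Wait until nothing in the page is under I/O, so it is safe to modify.
 * Returns how many buffers had to be waited on. */
int wait_for_stable_page(struct page *page)
{
    struct buffer_head *head, *bh;
    int waited = 0;

    assert(page != NULL);

    /* Enough on its own when all writes go through writepage. */
    model_wait_on_page_writeback(page);

    head = page->buffers;
    if (head == NULL)
        return 0;

    /* Buffers written directly never set the page writeback flag,
     * so each one has to be checked individually. */
    bh = head;
    do {
        if (bh->under_io) {
            model_wait_on_buffer(bh);
            waited++;
        }
        bh = bh->b_this_page;
    } while (bh != head);

    return waited;
}
```

The page-level wait alone models the writepage case; the loop is the extra
step needed when buffers are submitted directly, because those writes are
not visible in the page writeback state.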

Have you confirmed, by looking at the block mapping, that your CRC errors
are coming from data blocks?

-chris