Re: 2.6.4-mm2

From: Andrew Morton
Date: Tue Mar 16 2004 - 18:22:34 EST


Chris Mason <mason@xxxxxxxx> wrote:
>
> On Tue, 2004-03-16 at 13:32, Daniel McNeil wrote:
> > Andrew,
> >
> > I re-ran six copies of the direct_read_under test on an 8-proc
> > machine last night. All six tests saw uninitialized data.
>
> It is possible to trigger mpage_writepages twice at the same time,
> right?

Yes.

> Say once from sync_sb_inodes and once from filemap_fdatawrite?
> I'm assuming Daniel is hitting the same bug he reported before, a race
> between ll_rw_block from ext3 data=ordered and sychronous writeback from
> fsync or O_DIRECT.

OK, that can happen. Daniel had a fixlet for that and I assume he's
retrying that.

Not only can it happen with ext3, but also with any random filesystem which
does ll_rw_blk() of a random metadata block. sync_blockdev() can miss the
associated page. Conceivably this could leave I/O in flight after umount.

I'm thinking that the right thing to do here is to change submit_bh()
callers and ll_rw_block() to run set_page_writeback(bh->b_page) when they
start the buffer writeout and to do the run-around-the-buffer_heads thing
at I/O completion. Ho hum, that'll take a bit of work but at least it
kills off some exceptionalities.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/