Re: [PATCH] mm: fix page_mkclean_one (was: 2.6.19 file content corruption on ext3)

From: Linus Torvalds
Date: Sun Dec 24 2006 - 04:30:37 EST




On Sun, 24 Dec 2006, Andrew Morton wrote:
>
> > I now _suspect_ that we're talking about something like
> >
> > - we started a writeout. The IO is still pending, and the page was
> > marked clean and is now in the "writeback" phase.
> > - a write happens to the page, and the page gets marked dirty again.
> > Marking the page dirty also marks all the _buffers_ in the page dirty,
> > but they were actually already dirty, because the IO hasn't completed
> > yet.
> > - the IO from the _previous_ write completes, and marks the buffers clean
> > again.
>
> Some things for the testers to try, please:
>
> - mount the fs with ext2 with the no-buffer-head option. That means either:

[ snip snip ]

This is definitely worth testing, but the exact scenario I outlined is
probably not the thing that happens. It was really meant to be more of an
example of the _kind_ of situation I think we might have.

That would explain why we didn't see this before: we simply didn't mark
pages clean all that aggressively, and an app like rtorrent would normally
have caused its flushes to happen _synchronously_ by using msync() (even
if the IO itself was done asynchronously, all the dirty bit stuff would be
synchronous wrt any rtorrent behaviour).
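
Something like this trivial userspace snippet is what I mean by
"synchronous" (purely illustrative -- the filename and sizes are made up,
and it's obviously not what rtorrent actually does): the page gets cleaned
while the app is blocked inside msync(), not behind its back.

	#include <fcntl.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <sys/stat.h>
	#include <unistd.h>

	int main(void)
	{
		int fd = open("data.bin", O_RDWR);
		struct stat st;
		char *map;

		if (fd < 0 || fstat(fd, &st) < 0)
			return 1;

		map = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
			   MAP_SHARED, fd, 0);
		if (map == MAP_FAILED)
			return 1;

		memcpy(map, "chunk", 5);	/* dirty the page */

		/*
		 * MS_SYNC does not return until the writeout is done,
		 * so all the dirty-bit transitions happen while the
		 * app isn't touching the mapping.
		 */
		msync(map, st.st_size, MS_SYNC);

		munmap(map, st.st_size);
		close(fd);
		return 0;
	}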

And the things that /did/ use to clean pages asynchronously (VM scanning)
would always actually look at the "young" bit (aka "accessed") first, and
not even touch the dirty bit if an application had accessed the page
recently. That basically avoided any likely races, because we'd touch the
dirty bit ONLY if the page was "cold".
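
To make the ordering concrete, here's a rough sketch of what I mean -- this
is NOT the real page_referenced()/try_to_unmap() code, the function name is
made up, and locking and TLB flushing are hand-waved away. It's only meant
to show that the dirty bit is never even looked at while the accessed bit
is still set:

	#include <linux/mm.h>

	/* Illustrative only: "scan_pte_for_reclaim" doesn't exist. */
	static int scan_pte_for_reclaim(struct vm_area_struct *vma,
					unsigned long addr, pte_t *ptep)
	{
		pte_t pte = *ptep;

		if (pte_young(pte)) {
			/* Recently used: clear "young", leave dirty alone */
			ptep_test_and_clear_young(vma, addr, ptep);
			return 0;		/* keep the page */
		}

		if (pte_dirty(pte)) {
			/* Only a cold page has its dirty bit transferred
			 * over to the struct page */
			set_page_dirty(pte_page(pte));
			set_pte_at(vma->vm_mm, addr, ptep, pte_mkclean(pte));
		}
		return 1;			/* reclaim candidate */
	}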

So this is why I'm saying that it might be an old bug, and it would be
just the new pattern of handling dirty bits that triggers it.

But avoiding buffer heads and testing that part is worth doing, just to
remove one thing from the equation.

Linus