Re: Ok, explained.. (was Re: [PATCH] mm: fix page_mkclean_one)
From: Linus Torvalds
Date: Fri Dec 29 2006 - 12:52:45 EST
On Fri, 29 Dec 2006, Theodore Tso wrote:
> I'm confused. Does this mean that if "fs blocksize"=="VM pagesize"
> this bug can't trigger?
No. Even if there is just a single buffer-head, if the filesystem ever
writes out that _single_ buffer-head out of turn (ie before the VM
actually asks it to, with "->writepage()"), then the same issue will happen.
In fact, a bigger fs blocksize will likely just make this easier to
trigger (although I doubt it makes a big difference), since any
out-of-order buffer flushback will happen for the whole page, rather than
just a part of the page.
So the "problem" really ends up being that the filesystem does flushing
that the VM isn't aware of, so when the VM did "set_page_dirty()" at an
earlier time, the VM _expected_ the "->writepage()" call that happened
much later to write the whole page - but because the FS had flushed things
behind its back even _before_ the "->writepage()" happens, by the time the
VM actually asks for the page to be written out, the FS layer won't
actually write it all out any more.
Blocksize doesn't matter, the only thing that matters is whether something
writes out data on a buffer-cache level, not on a "page cache" level. Ext3
apparently does this in "ordered" data mode at least (and hey, I suspect
that the code that tries to release buffer head data might try to do it on
its own too).
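
The sequence described above can be sketched with a toy model. This is not kernel code: the class `ToyPage`, its methods and `run_scenario()` are hypothetical names made up for illustration. The model assumes that "->writepage()" only writes buffers still marked dirty, and that a store into an already-dirty page does not re-mark its buffers.

```python
class ToyPage:
    """Toy page with per-buffer dirty bits (hypothetical model, not kernel code)."""

    def __init__(self, nbuffers=1):
        self.memory = ["old"] * nbuffers        # contents in the page cache
        self.disk = ["old"] * nbuffers          # contents on disk
        self.page_dirty = False                 # page-level state the VM tracks
        self.buffer_dirty = [False] * nbuffers  # buffer-level state the FS tracks

    def set_page_dirty(self):
        # The VM marks the page dirty; in this model every buffer becomes dirty.
        self.page_dirty = True
        self.buffer_dirty = [True] * len(self.buffer_dirty)

    def store(self, index, value):
        # Assumption: a store into the page changes no dirty bits by itself.
        self.memory[index] = value

    def fs_flush_buffers(self):
        # The FS writes dirty buffers on its own; page_dirty is left untouched,
        # so the VM never learns about it.
        self._write_dirty_buffers()

    def writepage(self):
        # The VM asks for writeout; only buffers still marked dirty get written.
        self._write_dirty_buffers()
        self.page_dirty = False

    def _write_dirty_buffers(self):
        for i, dirty in enumerate(self.buffer_dirty):
            if dirty:
                self.disk[i] = self.memory[i]
                self.buffer_dirty[i] = False


def run_scenario(out_of_turn_flush, nbuffers=1):
    """Return what ends up on disk for buffer 0 after the VM's writepage()."""
    page = ToyPage(nbuffers)
    page.store(0, "v1")
    page.set_page_dirty()
    if out_of_turn_flush:
        page.fs_flush_buffers()   # buffer-level writeout the VM is unaware of
    page.store(0, "v2")           # later modification of the same page
    page.writepage()
    return page.disk[0]
```

In this model `run_scenario(False)` leaves "v2" on disk, while `run_scenario(True)` leaves the stale "v1", and that holds for `nbuffers=1` just as for `nbuffers=4`. That is the point about blocksize: a single buffer per page is enough once something writes it out at the buffer level.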