Re: [PATCH 1/2]: Fix BUG in cancel_dirty_pages on XFS

From: David Chinner
Date: Wed Jan 24 2007 - 19:36:18 EST


On Thu, Jan 25, 2007 at 11:12:41AM +1100, Nick Piggin wrote:
> David Chinner wrote:
> >On Thu, Jan 25, 2007 at 12:43:23AM +1100, Nick Piggin wrote:
>
> >>And why not just leave it in the pagecache and be done with it?
> >
> >
> >because what is in cache is then not coherent with what is on disk,
> >and a direct read is supposed to read the data that is present
> >in the file at the time it is issued.
>
> So after a writeout it will be coherent of course, so the point in
> question is what happens when someone comes in and dirties it at the
> worst possible moment? That relates to the paragraph below...
>
> >>All you need is to do a writeout before a direct IO read, which is
> >>what generic dio code does.
> >
> >
> >No, that's not good enough - after writeout but before the
> >direct I/O read is issued a process can fault the page and dirty
> >it. If you do a direct read, followed by a buffered read you should
> >get the same data. The only way to guarantee this is to chuck out
> >any cached pages across the range of the direct I/O so they are
> >fetched again from disk on the next buffered I/O. i.e. coherent
> >at the time the direct I/O is issued.
>
> ... so surely if you do a direct read followed by a buffered read,
> you should *not* get the same data if there has been some activity
> to modify that part of the file in the meantime (whether that be a
> buffered or direct write).

Right. And that is what happens in XFS because it purges the
caches on direct I/O and forces data to be re-read from disk.

Effectively, if you are mixing direct I/O with other types of I/O
(buffered or mmap) then the application really needs to be certain
it is doing the right thing because there are races that can occur
below the filesystem. All we care about in the filesystem is that
what we cache is the same as what is on disk, and that implies that
direct I/O needs to purge the cache regardless of the state it is in....

Hence we need to unmap pages and use truncate semantics on them to
ensure they are removed from the page cache....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/