On Thu, Jan 25, 2007 at 01:01:09PM +1100, Nick Piggin wrote:
David Chinner wrote:
No. The only thing that will happen here is that the direct read
will see _none_ of the write because the mmap write occurred during
the DIO read to a different set of pages in memory. There is no
"some" or "all" case here.
But if the buffers get partially or completely written back in the
meantime, then the DIO read could see that.
Only if you can dirty them and flush them to disk while the direct
read is waiting in the I/O queue (remember, the direct read flushes
dirty cached data before being issued). Given that we don't lock the
inode in the buffered I/O *writeback* path, we have to stop pages being
dirtied in the page cache up front so we don't have mmap writeback
over the top of the direct read.
Hence we have to prevent mmap for dirtying the same file offset we
are doing direct reads on until the direct read has been issued.
i.e. we need a barrier.
IOWs, at a single point in time we have 2 different views
of the one file which are both apparently valid and that is what
we are trying to avoid. We have a coherency problem here which is
solved by forcing the mmap write to reread the data off disk....
I don't see why the mmap write needs to reread data off disk. The
data on disk won't get changed by the DIO read.
No, but the data _in memory_ will, and now when the direct read
completes it will data that is different to what is in the page
cache. For direct I/O we define the correct data to be what is on
disk, not what is in memory, so any time we bypass what is in
memory, we need to ensure that we prevent the data being changed
again in memory before we issue the disk I/O.