Re: O_DIRECT question

From: Denis Vlasenko
Date: Thu Jan 25 2007 - 12:41:04 EST


On Thursday 25 January 2007 16:44, Phillip Susi wrote:
> Denis Vlasenko wrote:
> > I will still disagree on this point (on point "use O_DIRECT, it's faster").
> > There is no reason why O_DIRECT should be faster than "normal" read/write
> > to large, aligned buffer. If O_DIRECT is faster on today's kernel,
> > then Linux' read()/write() can be optimized more.
>
> Ahh but there IS a reason for it to be faster: the application knows
> what data it will require, so it should tell the kernel rather than ask
> it to guess. Even if you had the kernel playing vmsplice games to get
> avoid the copy to user space ( which still has a fair amount of overhead
> ), then you still have the problem of the kernel having to guess what
> data the application will require next, and try to fetch it early. Then
> when the application requests the data, if it is not already in memory,
> the application blocks until it is, and blocking stalls the pipeline.
>
> > (I hoped that they can be made even *faster* than O_DIRECT, but as I said,
> > you convinced me with your "error reporting" argument that reads must still
> > block until entire buffer is read. Writes can avoid that - apps can do
> > fdatasync/whatever to make sync writes & error checks if they want).
>
>
> fdatasync() is not acceptable either because it flushes the entire file.

If you opened a file and are doing only O_DIRECT writes, you
*always* have your written data flushed, by each write().
How is it different from writes done using
"normal" write() + fdatasync() pairs?

> This does not allow the application to control the ordering of various
> writes unless it limits itself to a single write/fdatasync pair at a
> time. Further, fdatasync again blocks the application.

Ahhh shit, are you saying that fdatasync will wait until writes
*by all other processes* to thios file will hit the disk?
Is that thue?

--
vda
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/