Re: O_DIRECT fails in some kernel and FS

From: Stephen Lord (lord@sgi.com)
Date: Sun Feb 03 2002 - 08:40:57 EST


Jeff Garzik wrote:

>On Sat, Feb 02, 2002 at 02:16:41PM -0600, Stephen Lord wrote:
>
>>Can't you fall back to buffered I/O for the tail? OK it complicates the
>>code, probably a lot, but it keeps things sane from the user's point of
>>view.
>>
>
>For O_DIRECT, IMHO you should fail not fallback. You're simply lying
>to the underlying program otherwise.
>

By fallback I meant just for the tail, not the whole file.

I have been there before. I had to implement mixed-mode buffered/direct
I/O on Unicos because a change in the underlying disk subsystems stopped
customer applications from working: the boundaries that had been allowed
for O_DIRECT no longer worked once the sales people sold them some new
disks. Mixed mode also meant you could get most of the speed benefits of
O_DIRECT without having to align your I/O, and really large I/Os could
be made to bypass the cache automatically to avoid cache thrashing.
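
A minimal sketch of the tail fallback idea, assuming a single alignment
value that applies to both file offset and length. The names io_split and
split_request are hypothetical; they are not taken from Unicos, XFS, or
Linux. The aligned body goes direct and the unaligned remainder goes
through buffers:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result of splitting one request (illustration only). */
struct io_split {
    size_t direct_len;   /* leading bytes that can bypass the cache */
    size_t buffered_len; /* unaligned tail that goes through buffers */
};

/*
 * Split a request of `len` bytes at `offset` for a device that requires
 * `align`-byte alignment. Assumption of this sketch: an unaligned start
 * sends the whole request through the buffered path.
 */
static struct io_split split_request(size_t offset, size_t len, size_t align)
{
    struct io_split s = { 0, 0 };

    assert(align > 0);

    if (offset % align != 0) {
        s.buffered_len = len;
        return s;
    }

    s.direct_len = len - (len % align); /* largest aligned prefix */
    s.buffered_len = len - s.direct_len; /* whatever is left over */
    return s;
}
```

With 4096-byte alignment, a 10000-byte request at offset 0 would split
into 8192 bytes of direct I/O and a 1808-byte buffered tail.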

What we had were two flags: one which said to use direct I/O, and another
which said to return an error to user space rather than go through
buffers. So "lie to me and make it work" or "don't lie to me" options,
I suppose.
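
As a rough illustration of that two-flag policy (the flag names
XIO_DIRECT and XIO_STRICT and the helper choose_path are made up for this
sketch, not the actual Unicos or Linux interface), the decision could
look like:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flags, invented for this sketch. */
#define XIO_DIRECT 0x1u /* caller asks for direct I/O */
#define XIO_STRICT 0x2u /* fail rather than fall back to buffers */

enum xio_path { XIO_PATH_BUFFERED, XIO_PATH_DIRECT };

/*
 * Pick the I/O path for one request. `aligned` says whether the request
 * meets the device's direct I/O constraints. Returns 0 and sets *path,
 * or -EINVAL when the caller asked not to be lied to.
 */
static int choose_path(unsigned flags, bool aligned, enum xio_path *path)
{
    if (!(flags & XIO_DIRECT)) {
        *path = XIO_PATH_BUFFERED; /* plain buffered I/O was requested */
        return 0;
    }
    if (aligned) {
        *path = XIO_PATH_DIRECT;
        return 0;
    }
    if (flags & XIO_STRICT)
        return -EINVAL;            /* "don't lie to me" */

    *path = XIO_PATH_BUFFERED;     /* "lie to me and make it work" */
    return 0;
}
```

An unaligned request with only XIO_DIRECT set quietly takes the buffered
path; with XIO_STRICT also set, the same request fails with -EINVAL.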

>
>
>In the ibu fs I am hacking on, the idea for O_DIRECT is to fail a read
>if the file is small enough to fit in the inode. If the O_DIRECT
>action is a write, then I will invalidate the data in the inode,
>then follow the standard path (which eventually calls get_block()).
>
>For file tails (a different case from small-file-in-inode), I
>imagine it would be prudent to support O_DIRECT for all actions
>except reading the file tail. If you want to be complicated, you
>could provide userspace with a way to say "this is a dense file"
>and/or simply not create a tail at all...
>
I suspect the reason XFS never did small files in the inode was because of
the problems with implementing mmap and O_DIRECT.

>
> Jeff
>
>
Steve



