Yes. Actually I avoid the stat(): open(name, O_NOFOLLOW|O_DIRECTORY),
with an optional fstat(), gets equivalent results in fewer system
calls. (System call overhead is significant when the data is already
in cache.)
Either way the inodes are read in inumber order.
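
For illustration, the open()-based check looks something like this (a
rough sketch, not my actual code; the helper name is made up):

#define _GNU_SOURCE		/* for O_NOFOLLOW and O_DIRECTORY */
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Open "name" only if it is a directory, without following symlinks.
 * Returns an fd ready for reading entries, or -1.  The fstat() is
 * optional; skip it when only the directory check is wanted. */
int open_dir_nofollow(const char *name, struct stat *st)
{
	int fd = open(name, O_RDONLY | O_NOFOLLOW | O_DIRECTORY);

	if (fd < 0)
		return -1;	/* not a directory, a symlink, or error */
	if (st && fstat(fd, st) < 0) {
		close(fd);
		return -1;
	}
	return fd;		/* caller closes */
}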
> If anything, you might want to start readahead on inode tables. Hmm... You
> know, it might work _and_ would be completely independent from VFS. I'll
> look at it.
I don't understand. You mean that reading inode X causes async
readahead for inodes [X+1...X+99] too? Do you have in mind readahead on
inumbers (which are not always contiguous on disk) or on block numbers?
Ideally for my program, I would like to be able to submit a list of
inodes and read them all in, in whatever order is best. Over NFS, that
might mean issuing overlapping requests, for example. On ext2 on a
local disk, it would involve feeding a lot of requests to the elevator
algorithm so they can be serviced in block order.
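
For the local-disk case, the inumber sort mentioned above is the
simple approximation; roughly like this (a sketch with a hypothetical
helper, error handling trimmed):

#include <dirent.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>

struct ent {
	ino_t ino;
	char name[256];
};

static int by_ino(const void *a, const void *b)
{
	ino_t x = ((const struct ent *)a)->ino;
	ino_t y = ((const struct ent *)b)->ino;
	return (x > y) - (x < y);
}

/* Read all entries of "dir", then stat() them sorted by inode number
 * so the inode table blocks are visited roughly sequentially. */
int stat_dir_by_inumber(const char *dir)
{
	DIR *d = opendir(dir);
	struct dirent *de;
	struct ent *v = NULL;
	int i, n = 0;
	struct stat st;

	if (!d || chdir(dir) < 0)	/* chdir so the names resolve */
		return -1;
	while ((de = readdir(d)) != NULL) {
		v = realloc(v, (n + 1) * sizeof(*v));
		v[n].ino = de->d_ino;
		strncpy(v[n].name, de->d_name, sizeof(v[n].name) - 1);
		v[n].name[sizeof(v[n].name) - 1] = '\0';
		n++;
	}
	closedir(d);

	qsort(v, n, sizeof(*v), by_ino);
	for (i = 0; i < n; i++)
		stat(v[i].name, &st);	/* inode reads in inumber order */
	free(v);
	return 0;
}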
One way is to use threads: as each thread blocks on I/O, there is room
for another to submit the next request. This should give better
results over a network due to the pipelining, and better results
locally due to block request merging and assistance from the elevator
algorithm. (But only if the elevator algorithm will insert new
requests; I am under the impression it does not.)
I tried to simulate readahead/overlap in user space using threads,
each reading a sequential range of inodes pulled from a common list
using atomic operations. But it was a net loss in all cases but one,
and even in that case the gain wasn't significant.
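
Roughly, the experiment looked like this (a sketch with made-up names;
C11 atomics stand in for whatever atomic primitive is to hand):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/stat.h>

#define NTHREADS 4
#define BATCH    16		/* sequential range claimed per fetch */

static char **names;		/* shared list of paths to stat */
static int nnames;
static atomic_int next_idx;

static void *worker(void *arg)
{
	struct stat st;
	int i, end;

	(void)arg;
	for (;;) {
		/* claim the next sequential range atomically */
		i = atomic_fetch_add(&next_idx, BATCH);
		if (i >= nnames)
			return NULL;
		end = i + BATCH < nnames ? i + BATCH : nnames;
		for (; i < end; i++)
			if (stat(names[i], &st) < 0)
				perror(names[i]);
	}
}

void stat_all_threaded(char **list, int n)
{
	pthread_t tid[NTHREADS];
	int t;

	names = list;
	nnames = n;
	atomic_init(&next_idx, 0);
	for (t = 0; t < NTHREADS; t++)
		pthread_create(&tid[t], NULL, worker, NULL);
	for (t = 0; t < NTHREADS; t++)
		pthread_join(tid[t], NULL);
}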
-- Jamie