1. Try an up-to-date kernel ;-)
2. Then, try `chattr -R +A directory', which turns off atime updating
on the directory tree. That makes a huge difference for large trees.
3. Consider writing a program to scan the tree in a better inode order,
as getdents() returns the inode numbers. GNU find does not use this
information; perhaps the BSD program does.
[I have a C program here which does something like that, alternating
the inode lstat() sweeps with the directory reading sweeps in inode
number order. It makes a huge difference with some trees, less so
for others.]
4. Maybe BSD/FFS are doing more inode lookahead.
5. I have read that BSD provides access to the type of file when reading
directory entries; that can affect the choice of lstat() order.
Ext2 maintains that information in the filesystem, but Linux does not
provide it through the current getdents() interface.
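
To illustrate point 3, here is a minimal sketch of the basic idea for a
single directory. It is not the program mentioned above. It reads all
the entries first, sorts them by the inode number that readdir() reports,
and then calls lstat() on them in that order. The names
lstat_in_inode_order() and struct entry are made up for this example,
and it assumes a POSIX system where struct dirent has a d_ino field.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* One directory entry: its name and the inode number from readdir(). */
struct entry {
    ino_t ino;
    char *name;
};

/* qsort() comparator: ascending inode number. */
static int by_inode(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    return (x->ino > y->ino) - (x->ino < y->ino);
}

/*
 * Hypothetical helper: lstat() every entry of one directory in inode
 * order.  Returns the number of entries lstat()ed successfully, or -1
 * if the directory cannot be opened or memory runs out.
 */
long lstat_in_inode_order(const char *dirpath)
{
    struct entry *v = NULL;
    size_t n = 0, cap = 0, i;
    struct dirent *de;
    long result = -1, ok = 0;
    DIR *d;

    assert(dirpath != NULL);
    d = opendir(dirpath);
    if (d == NULL)
        return -1;

    /* Sweep 1: read the directory, remembering names and inode numbers. */
    while ((de = readdir(d)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (n == cap) {
            size_t ncap = cap ? cap * 2 : 64;
            struct entry *nv = realloc(v, ncap * sizeof *nv);
            if (nv == NULL)
                goto out;
            v = nv;
            cap = ncap;
        }
        v[n].name = strdup(de->d_name);
        if (v[n].name == NULL)
            goto out;
        v[n].ino = de->d_ino;
        n++;
    }

    /* Put the entries in inode order, so the stat sweep visits the
     * inodes in ascending number rather than in directory order. */
    if (n > 1)
        qsort(v, n, sizeof *v, by_inode);

    /* Sweep 2: lstat() each entry in inode order. */
    for (i = 0; i < n; i++) {
        char path[PATH_MAX];
        struct stat st;
        int len = snprintf(path, sizeof path, "%s/%s", dirpath, v[i].name);

        if (len < 0 || (size_t)len >= sizeof path)
            continue;               /* path too long, skip it */
        if (lstat(path, &st) == 0)
            ok++;
    }
    result = ok;

out:
    for (i = 0; i < n; i++)
        free(v[i].name);
    free(v);
    closedir(d);
    return result;
}
```

Calling lstat_in_inode_order("/some/dir") warms the inode cache for that
one directory only. A whole-tree scanner would repeat the two sweeps
level by level, feeding the subdirectories found by lstat() into the
next directory-reading sweep.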

I can make Squid start up in 6 seconds *total* instead of 300 seconds,
by running my optimised tree scan program on Squid's cache directory first.
The tree scan itself takes about 4 seconds from cold cache. I have
atime updates turned off for that directory.

OTOH, scanning the entire filesystem for `updatedb' still uses tons of
I/O bandwidth, mostly to write atime-updated directory inodes to disk.
An O_NOATIME option would be a big win for this, IMO.
-- Jamie