Re: large directory handling speed

Alexander Viro (viro@math.psu.edu)
Sat, 29 May 1999 19:39:04 -0400 (EDT)


On Sun, 30 May 1999, Alexander V. Lukyanov wrote:

> On Thu, May 27, 1999 at 01:37:31PM -0400, Alexander Viro wrote:
> > On Tue, 25 May 1999, Alexander V. Lukyanov wrote:
> > > So, would it be possible to cache file information somehow in vfat directory
> > > reading routine - create dentry or something?
> >
> > It may be unnecessary now. Could you try the version in 2.3.2 or later?
> > I believe that I had removed a bottleneck in fat_get_entry() and it might
> > reduce a problem. There is a way to make fat_readdirx() and friends
> > faster, but I suspect that the main sucker was fat_get_entry().
>
> ls -l > /dev/null in a smaller (but still huge) directory:
> 2.2.9:
> 1.32user 489.66system
> 2.3.3:
> 1.14user 240.97system
>
> wow, 2x improvement! But as it is quadratic algorithm, the readdir must be 1.4
> times faster.

> It is certainly nice improvement, but ls is still quadratic.

Hmm... It is quadratic on ext2 too. I'm very sceptical about populating
dcache on readdir() for any filesystem, mostly because it will be useless
on small directories and will trigger cache reaping on the large ones. The
same applies to icache. We can (somewhat) speed the things up if we'll
play with a small cache for unicode translation buffers and it's probably
worth doing, but I'm not sure that it will give any dramatic improvement.
The thing will remain quadratic anyway... We might keep a directory cache
as it is done for NFS &co, but it would hardly help in case of ls -l...
Another possible change would be to remember the postion of last lookup
and on further lookups do a cyclic search from the next one. That can be
done, but I'm not sure that it is worth doing. I will do an update to
FAT when the thing will go to 2.2. Then I'll try those variants. The last
one seems to be the most promising and could be applied to other
filesystems too. Possible drawback being that with the current setup we
can slightly optimize things if we know which files are likely to be
accessed - put them in the beginning. With such change we lose it. OTOH
dcache may be enough here...

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/