Re: Is any file system on Linux appropriate for very large directories?

Systemkennung Linux (linux@mailhost.uni-koblenz.de)
Sat, 13 Jul 1996 15:57:50 +0200 (MET DST)


Hi,

> that uses some kind of hashing for name lookup! A quick review of the
> file systems currently available on Linux suggests that the only one
> that uses hashing is the Amiga file system. I don't mean to be
> prejudiced, but it's hard to imagine that the Amiga FS is the going to
> be the best choice for us.

Amiga's FFS performs very badly for large directories. The hashing used
effectively divides the linear directory list into 76 lists, each about
1/76 of the original size. All data for one file is kept in a single
disk block, so for a directory with 10000 entries about 10000/(76*2) = 65.8
read accesses are needed on average (if memory serves me right - it's
been a long time since I last hacked on Amiga filesystems). As you can
see, this is still O(n), which is truly bad. Apparently the design was
made with floppy disks in mind - small directories have very fast access.
There are newer variants of the Amiga FFS which perform better by using
directory caches. These are a bit slower for file creation, much faster
for other directory operations, and need a few percent more disk space
for the directory cache.
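Just to make the arithmetic concrete, here is a little throwaway C
program (not FFS code - the 76 buckets and the one-entry-per-block
assumption are just my guesses above) that prints the expected block
reads per lookup for a few directory sizes:

    /*
     * Back-of-the-envelope model of an Amiga-FFS-style directory lookup.
     * Hypothetical sketch, not real FFS code: it only illustrates why
     * hashing a name into one of a fixed number of chains still leaves
     * the lookup cost linear in the directory size.
     */
    #include <stdio.h>

    #define HASH_BUCKETS 76   /* fixed-size hash table in the dir block */

    /* Average disk blocks read for one successful name lookup, assuming
     * each entry lives in its own block and the chains are evenly
     * filled: on average half of one chain (n / buckets / 2) is walked. */
    static double avg_lookup_reads(unsigned long entries, unsigned buckets)
    {
        return (double)entries / (buckets * 2.0);
    }

    int main(void)
    {
        unsigned long sizes[] = { 100, 1000, 10000, 100000 };
        for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
            printf("%6lu entries -> ~%.1f block reads per lookup\n",
                   sizes[i], avg_lookup_reads(sizes[i], HASH_BUCKETS));
        return 0;   /* cost grows linearly with the entry count: O(n) */
    }

Doubling the directory size doubles the expected reads, so the fixed
hash table only buys a constant factor, not a better complexity class.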

I think Stephen Tweedie has some plans to speed up ext2fs for large
directories using hashing, so stay tuned.

Ralf