Re: Hashing and directories

From: Daniel Phillips (phillips@bonn-fries.net)
Date: Fri Apr 27 2001 - 11:20:19 EST


On Thursday 08 March 2001 13:42, Goswin Brederlow wrote:
> >>>>> " " == Pavel Machek <pavel@suse.cz> writes:
> > Hi!
> >
> >> I was hoping to point out that in real life, most systems that
> >> need to access large numbers of files are already designed to
> >> do some kind of hashing, or at least to divide-and-conquer by
> >> using multi-level directory structures.
> >>
> > Yes -- because they're working around kernel slowness.
> >
> > I had to do this kind of hashing because the kernel disliked 70000
> > html files (a copy of train timetables).
> >
> > BTW try rm * with 70000 files in a directory -- the command line
> > will overflow.
>
> There are filesystems that use btrees (reiserfs) or hashing (affs) for
> directories.
>
> That way you get O(log(n)) or even O(1) access time for
> files. Sadly the hash table for affs depends on the block size, and
> Linux AFAIK only allows far too small block sizes (512 bytes) for affs.
> It was designed for floppies, so the lack of dynamically resizing hash
> tables is excused.
>
> What could also be done is to keep directories sorted. Creating
> files would cost O(n) time, but a lookup could be done in
> O(log(log(n))) most of the time with a reasonable name distribution.
> This could be done with ext2 without breaking any compatibility. One
> would need to convert (re-sort all directories) every time the FS was
> mounted RW by an older ext2, but otherwise nothing changes.
>
> Would you like to write support for this?

Hi, I missed this whole thread at the time, ironically, because I was
too busy working on my hash-keyed directory index.

How do you get log(log(n))? The best I can do is log_b(n), with b
large. For practical purposes this is O(1).
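
To put a rough number on "b large": if each index block holds a few
hundred hash-keyed entries (say b = 500, an assumed round figure, not
anything measured in this thread), the index depth ceil(log_b(n))
stays at two or three levels even for millions of entries, which is
why log_b(n) is effectively constant in practice. A throwaway sketch
of that arithmetic:

#include <math.h>
#include <stdio.h>

/*
 * Back-of-envelope only: assume a fanout of b entries per index block
 * (500 is an assumed round number, not a measured one) and print the
 * resulting index depth ceil(log_b(n)) for a few directory sizes.
 */
int main(void)
{
        const double b = 500.0;
        const double sizes[] = { 1e3, 1e5, 1e6, 1e8 };
        int i;

        for (i = 0; i < 4; i++)
                printf("n = %9.0f entries -> depth %.0f\n",
                       sizes[i], ceil(log(sizes[i]) / log(b)));
        return 0;
}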

The only problem I ran into was the mismatch between the storage order
of the sorted directory and that of the inodes, which causes thrashing
in the inode table. I was able to eliminate this thrashing completely
from user space by processing the files in inode order. I went on to
examine ways of eliminating the thrashing without help from user space,
and eventually came up with a good way of doing that, which relies on
setting an inode allocation target that corresponds loosely to the sort
order.
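
The user-space trick amounts to reading the whole directory first,
sorting the entries by inode number, and only then doing the per-file
work, so the inode table gets walked in storage order. A minimal
sketch (not the actual test program; it just prints the entries in the
order the real stat()/unlink() work would be done):

#include <dirent.h>
#include <stdio.h>

/* scandir() comparator: order directory entries by inode number. */
static int by_inode(const struct dirent **a, const struct dirent **b)
{
        if ((*a)->d_ino < (*b)->d_ino)
                return -1;
        return (*a)->d_ino > (*b)->d_ino;
}

int main(int argc, char **argv)
{
        struct dirent **names;
        int i, n;

        if (argc != 2) {
                fprintf(stderr, "usage: %s <directory>\n", argv[0]);
                return 1;
        }
        n = scandir(argv[1], &names, NULL, by_inode);
        if (n < 0) {
                perror("scandir");
                return 1;
        }
        /* stat()/unlink()/etc. would go inside this loop, in this order. */
        for (i = 0; i < n; i++)
                printf("%10lu  %s\n",
                       (unsigned long)names[i]->d_ino, names[i]->d_name);
        return 0;
}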

The thrashing doesn't hurt much anyway compared to the current N**2
behaviour. For a million files I saw a 4X slowdown for delete vs
create. Create shows no thrashing because it works through the inode
table in storage order, and the directory blocks themselves are smaller
by a factor of 6-7, so they do not contribute significantly to the
cache pressure. Compare this to the 150-times slowdown you see with
normal Ext2 when creating 100,000 files.
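
For scale: with a linear directory, each create has to scan the
entries already present, so creating n files costs on the order of
n^2/2 entry comparisons, roughly 5 * 10^9 for n = 100,000, while an
indexed directory needs only a couple of block probes per create,
which is the kind of asymptotic gap behind a slowdown like that.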

--
Daniel