On December 8, 2001 08:16 pm, Ragnar Kj°rstad wrote:
> On Sat, Dec 08, 2001 at 03:15:50AM +0300, Hans Reiser wrote:
> > >>no, actually this is a problem for v3. stat data are time of creation
> > >>ordered (very roughly speaking)
> > >>and directory entries are hash ordered, meaning that ls -l suffers a
> > >>major performance penalty.
> > >
> > >Yes, just remember that file-body ordering also has the same problem.
> > >(ref the "find . -type f | xargs cat > /dev/null" test wich I think
> > >represent maildir performance pretty closely)
> > So is this a deeply inherent drawback of offering readdir name orders
> > that differ hugely from time of creation order?
> It should not be, if:
> * If the cache was smart enough to detect that the user is reading all
> the files in a directory and started reading in files into memory
> ahead of time. It sounds ugly, so I don't know if I like it.
> * file-bodies were ordered by hash as well.
> > I suppose we could use objectids based on the hash of the first
> > assigned filename plus a 60 bit global to the FS counter....
> > but it is too many bits I think. I think that using substantially less
> > than the full hash of the name that is used for directory entry keys
> > doesn't work.... Comments welcome.
> The abould stort the file-bodies in optimal order in the three, but
> block-allocation is a seperate issue, right? Even if block-allocation
> would take objectids into account it would be nearly impossible to get
> the optimal order of the file-bodies, because you don't know the number
> of files and their sizes before allocating the blocks for the first
> files. (unless you would move files around after they are allocated)
> So, I think the _only_ way to get the optimal performance for a growing
> directory is to do allocation and ordering by creating-time.
> That said, maybe trying to find algorithms that are order sub-sets of
> the directories entries in optimal order rather than the whole directory
> is perhaps more constructive? Or look at repackers or other utilities to
> reorder data in batch instead?
> So how is this done in ext2/3 with directory indexes, Daniel? Are there
> any papers available, or would I have to decifer the source?
You should find this useful:
The coherency between inode order and file body order is handled for me
in the existing Ext2 code base. I haven't really dug into that algorithm but
it seems to produce servicable results. Note: Al Viro has taken a look at
improving that code. It's an ongoing project that's been discussed on lkml
As far as coherency between readdir order and inode order goes, I'v just
left that dangling for the moment. This doesn't hurt until we get over a
million files/directory, and then it doesn't hurt an awful lot. As I
mentioned earlier, I think the increased table thrashing exhibited over the
million file mark is more because of shortcomings in icache handling than
In the long run I plan to do some work on inode allocation targets to improve
the correspondence between readdir order and inode order.
-- Daniel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to firstname.lastname@example.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Sat Dec 15 2001 - 21:00:14 EST