On Mon, 17 Jun 2002, Andreas Dilger wrote:
> On Jun 17, 2002 17:36 -0700, Andrew Morton wrote:
> > You can probably lessen the seek-rate by accessing the files in the correct
> > order. Read all the files from a directory before descending into any of
> > its subdirectories. Can find(1) do that? You should be able to pretty
> > much achieve disk bandwidth this way - it depends on how bad the inter-
> > and intra-file fragmentation has become.
>
> Just FYI - "find -depth" will do that, from find(1):
> -depth Process each directory's contents before the directory itself.
Not quite... even with -depth, find processes a directory's contents in
the order the filenames happen to appear on disk, and it will recurse into
subdirectories before it has finished with all of the files in the current
directory.

What I think Andrew was suggesting is to process all the non-directory
entries before handling any of the directory entries.
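Concretely, given a tree like this (made-up names, just for illustration):

    ./a      (file)
    ./b      (dir)
    ./b/c    (file)
    ./d      (file)

"find -depth" emits ./a, ./b/c, ./b, ./d, . (modulo on-disk name order),
so it has descended into b before it ever touches d. The order we're
after reads ./a and ./d before anything under ./b.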
I'm working with something like this:
find . -type f -print \
| perl -ne '@c = split(/\//); print $#c . " $_";' \
| sort -k 1,1nr -k 2,2 \
| perl -pe 's#^\d+ ##;' \
> filelist
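To make the sort keys concrete: the first perl stage tags each path with
its depth, so for the example tree above sort(1) sees

    1 ./a
    2 ./b/c
    1 ./d

and emits the deepest paths first (ties broken by pathname); the final
perl strips the count back off, leaving

    ./b/c
    ./a
    ./d

in filelist.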
(Deliberately not using perl's in-memory sort because I want the thing to
scale, of course ;)
That reverse-sorts by number of path components first, and by pathname
second. It seems to be worth another 10% or so of improvement on the
single-reader test.
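(For the curious, a minimal single-reader test along these lines, assuming
filenames without whitespace, is just:

    time xargs cat < filelist > /dev/null

though that isn't necessarily the exact harness I ran.)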
I figured handling the leaves first was best... it's hard to tell if
there's any difference if I remove the "r" (i.e. sort shallowest-first
instead). (This is on a live system with other loads perturbing the
results, unfortunately.)
-dean