Re: multiply files in one (was GNU/Linux stance by Richard Stallman)

Richard Gooch (rgooch@atnf.csiro.au)
Tue, 30 Mar 1999 10:05:00 +1000


Larry McVoy writes:
> : > Your still missing the performance implications. In many, many typical
> : > cases (and this will become more and more true as memory gets cheaper),
> : > it would be far faster to eat the whole tarball and explode it into a
> : > bunch of in memory files even to get just a few of the files. Work
> : > through the math - if each file costs you a seek, you only need to get
> : > to 2-3 files before it would be faster to get 'em all.
> :
> : Hm. But do we need a special FS for this? In theory, isn't this
> : covered by a sufficiently aggressive read-ahead mechanism at the block
> : layer and in the drive itself? If the HDD reads in a whole sector into
> : its cache, that will contain a fair number of small files.
>
> THINK. You guys aren't THINKING. It is very FRUSTRATING.

All right, calm down.

> If you have 200 small files and you have to go get 200 inodes and
> 200 blocks to read them in, the inodes and the blocks aren't next to
> each other, that's 400 disk transactions instead of 1.

OK, so please explain the scale of what you have in mind. How big is
the compressed tarball, and how big is it uncompressed?

Do you plan on having a large FS (say /usr) stored in this way, or are
you thinking of a selected subset?

Unless you have a very high compression ratio, why is there so much
benefit of reading in 10 MBytes of a compressed FS rather than just
reading in 10 MBytes of buffers on ext2fs?

Note that I'm thinking about reading-ahead the inodes *and* the
blocks. If you have RAM to burn, then this looks like it should
work. I'm not thinking in terms of reading ahead a few piddling
blocks. I'm talking megabytes or dozens of megabytes.

So with two seek operations, you can read-ahead lots of files and lots
of data.

> THINK. PLEASE. This is the third time I've gone over this point.

So we've having a communications problem. Bear with us while we
attempt to synchronise.

Regards,

Richard....

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/