Re: multiply files in one (was GNU/Linux stance by Richard Stallman)

root (reiser@ceic.com)
Mon, 14 Jun 1999 13:26:24 +0000 (/etc/localtime)


Actually, if you put the file bodies after the metadata in the logical
ordering of a reiserfs tree, it all sort of works.

Hans

Theodore Y. Ts'o writes:
> Date: Mon, 29 Mar 1999 23:20:06 -0800
> From: Larry McVoy <lm@bitmover.com>
>
> : Hm. OK, maybe the problem is that you're thinking about typically
> : small files. Unfortunately I can't find your original message with the
> : histogram in my archives. For my /usr/bin, the median file size is
> : 9216 bytes, which, IIRC, is larger than the median you measured.
>
> No it isn't - that's almost exactly the same size median 6 years after
> I gathered my data. Nice to know that things don't change much.
>
> Yup, that were my numbers as well, which is why I didn't bother with
> storing < 60 byte files in the inode table. They just didn't exist.....
>
> The problem, though, with storing the data inline with directory or the
> inode table is that you start losing when you do directory searches.
> Consider a News server application; if articles less than, say 32k are stored
> in-line, then the amount of disk reading you need to do scan a directory
> with potentially thousands or tens of thousands of entries becomes
> extremely significant. You can use a B-tree approach, but then you
> still end up seeking between the interior nodes and the leaf nodes when
> you are looking for a specific file, and you're back to the old data
> being stored separately from the metadata model.
>
> Bzzt. One I/O for the whole mess. Even if seeks were free, rotational
> delays are not and I/O transactions are not. I want one I/O for the
> whole directory.
>
> Sure, disk reads are "free" when compared to seeks, but only up to
> a certain point --- and the scaling issues when you're talking about a
> news server are truely frightening. (Also note that disk reads being
> free also often require that you have fast amounts of memory to use as
> buffer cache, so that you don't end up pitching disk buffers out to make
> room for additional reads before you have a chance to use them. A
> number of filesystem designs work well if you have large amounts of
> memory available, but not in more normal configurations.)
>
> No, it wouldn't. It would only do so if all the files in a single
> directory were adjacent to each other. Which they typically aren't.
> Many file systems will make files contiguous, but very few (waffle
> is the only one I know of) will make files contguous to each other
> because they are in the same directory and written by the same user.
>
> The bottom line is that it is *very* hard to design a filesystem which
> works well for all access patterns and all applications. Some
> applications will want to look at all of the files in the directory,
> sooner or later (e.g., doing a kernel compile), so reading them all in
> is a big win. Other applications have a gargantuan number of files in a
> directory, and are doing effectively random access amongst those files
> (e.g., a news server). Optimizing down the number of seeks and disk
> block reads for one application will often screw you for another
> application.
>
> The solution? There really aren't any pretty ones, I'm afraid. I've
> often thought that news requires a specialized filesystem (possibly one
> done in user-mode), and that trying to force news to use a
> general-purpose filesystem is a mistake....
>
> - Ted
>
> P.S. <Sarcastic, JUST JOKING mode> Of course, some people feel that
> news is a mistake period, and nothing will save Usenet from its
> infestation of SPAM and AOL users, so we should just give up on it and
> consign it to the sewers of the Internet, like IRC. </Sarcastic, JUST
> JOKING mode>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/