Re: (reiserfs) Re: Implementing Meta File information in Linux (and a note at the

Richard Gooch (rgooch@atnf.csiro.au)
Thu, 3 Sep 1998 22:47:42 +1000


Hans Reiser writes:
> Richard Gooch wrote:
>
> > Can you explain why you think that keeping the metadata in a normal
> > file and splitting the data streams in userspace is going to be slow?
>
> Imagine a million small pieces of metadata. Much better to make a
> directory out of it, and let the FS optimize the storage and indexing
> and caching and everything else. The FS is not going to give your
> storage layer enough control via the VFS interface for your storage
> to do its job as well as the FS could.

Fine. Note that I did state my assumptions: not many streams, and
secondary streams that are small. That is the case where I'm arguing
in favour of keeping it all in a single file with no kernel/FS tricks.

> > I would have expected the file would have a "header" which describes a
> > series of data streams, including file offsets. Putting the non-data
> > metadata at the front would allow the normal data component to grow
> > without any problems. Even if more metadata was added later, it would
> > likely be small, so it could be easily cached and appended as the
> > normal data grew. Accessing a particular stream is just a seek
> > operation.
>
> What about insertion? This is the killer issue... Do you want to
> rewrite the entire 1GB file to insert 100 bytes?

Again, I expressly ignored this scenario in my assumptions.

As I said:

> > Of course, if you have large metadata, then you're better off making
> > each stream a file in a directory and letting the FS do the work.
> > The FS code already has one bunch of code to implement growing streams
> > (files), I think we'd want to leverage that instead of implementing
> > yet another level of what is effectively file/directory management
> > inside the FS.

To clarify: I see two situations. The first is where you have a small
number of secondary streams of small size which don't grow much. Here
a single file can suffice.
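
As a rough illustration of this first case (a sketch only, not code from
this thread), a single file could start with a header listing each
secondary stream's offset and length, with the normal data placed last so
it can grow without touching the header. The magic value, the on-disk
layout and the function names below are all assumptions made up for the
example.

```python
import io
import struct

MAGIC = b"MSTR"  # hypothetical magic number, invented for this sketch


def pack_container(meta_streams, data):
    """Build a container: header, small metadata streams, then the main data.

    meta_streams maps stream name (str) -> payload (bytes).
    """
    names = list(meta_streams)
    # Each table entry: name length (2 bytes), name, offset (8), length (8).
    table_size = sum(2 + len(n.encode()) + 16 for n in names)
    offset = len(MAGIC) + 4 + table_size  # first byte after the header
    table = b""
    body = b""
    for n in names:
        raw = n.encode()
        payload = meta_streams[n]
        table += struct.pack("<H", len(raw)) + raw
        table += struct.pack("<QQ", offset, len(payload))
        offset += len(payload)
        body += payload
    # The main data goes last and runs to end of file, so it can be appended to.
    return MAGIC + struct.pack("<I", len(names)) + table + body + data


def read_header(f):
    """Return ({name: (offset, length)}, offset where the main data starts)."""
    if f.read(len(MAGIC)) != MAGIC:
        raise ValueError("not a container file")
    (count,) = struct.unpack("<I", f.read(4))
    streams = {}
    data_start = f.tell()  # correct when there are no metadata streams
    for _ in range(count):
        (name_len,) = struct.unpack("<H", f.read(2))
        name = f.read(name_len).decode()
        off, length = struct.unpack("<QQ", f.read(16))
        streams[name] = (off, length)
        data_start = off + length  # streams are stored back to back
    return streams, data_start


def read_stream(f, streams, name):
    """Accessing a particular stream is just a seek followed by a read."""
    offset, length = streams[name]
    f.seek(offset)
    return f.read(length)
```

Appending to the normal data is then an ordinary write at end of file.
Inserting bytes into the middle of any stream would still mean rewriting
everything after that point, which is the limitation raised above.
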
The second case is where these assumptions break down, and in that
case I think each stream belongs in a file. A userspace API is written
to allow you to deal with the dataset as a single object. For
consistency, even where the above assumptions hold, it's probably
better to use one file per stream.
I see directories being excellent placeholders for grouping
metadata. With reiserFS, this userspace implementation will be very
fast, so it's the logical way to go, IMO.
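
A minimal sketch of the one-file-per-stream approach, assuming a
hypothetical userspace wrapper class (the name Dataset and its methods are
invented here, not an existing API): the directory is the dataset, each
stream is an ordinary file inside it, and growth of any stream is left to
the filesystem.

```python
import os


class Dataset:
    """Hypothetical wrapper treating a directory of stream files as one object."""

    def __init__(self, path):
        self.path = path
        os.makedirs(path, exist_ok=True)

    def _file(self, name):
        # Stream names map directly to file names, so reject path tricks.
        if not name or name in (".", "..") or "/" in name or os.sep in name:
            raise ValueError("invalid stream name: %r" % (name,))
        return os.path.join(self.path, name)

    def streams(self):
        """List the stream names, i.e. the files in the directory."""
        return sorted(os.listdir(self.path))

    def write(self, name, data):
        """Create or replace one stream without touching the others."""
        with open(self._file(name), "wb") as f:
            f.write(data)

    def append(self, name, data):
        """Grow one stream; the FS handles allocation for that file."""
        with open(self._file(name), "ab") as f:
            f.write(data)

    def read(self, name):
        with open(self._file(name), "rb") as f:
            return f.read()
```

For example, Dataset("report").append("data", b"...") only touches the
"data" file, so changing one stream never rewrites another. How fast this
is with many small streams depends on how well the underlying filesystem
handles small files.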

Regards,

Richard....
