Re: I request inclusion of reiser4 in the mainline kernel

From: Hans Reiser
Date: Tue Oct 04 2005 - 16:11:09 EST


Christoph Hellwig wrote:

>On Mon, Sep 26, 2005 at 07:03:52PM +0400, Vladimir V. Saveliev wrote:
>
>
>>> - you still do your plugin mess in ->readpage. Honestly, could you
>>> please explain why mpage_readpage{,s} don't work for you?
>>>
>>>
>>The reason is performance. Reiser4 uses a search through the filesystem tree to
>>access metadata of a file.
>>If reiser4 implemented its read/write via generic functions it would have to
>>repeat the search for every page being read/written.
>>It is especially disappointing because reiser4 uses extents with which it is
>>able in most cases to find all file pointers to data blocks with only one search
>>through the tree.
>>Another reason is multithreaded load.
>>A tree search includes the complex work of providing correct concurrent
>>read/write access to the tree, including a deadlock avoidance algorithm.
>>Under a multithreaded filesystem load, minimizing the number of tree
>>searches should improve filesystem throughput.
>>
>>These are still on the todo list.
>>
>>
>
>Let's start with the read side. You are using the generic sendfile now
>which is the same as the generic read routine as far as the filesystem
>is concerned.
>
I am not sure that is the right thing for us to be doing actually. We
need to abstract things our own way to create a proper abstraction
underneath sys_reiser4() and the general naming system that we are
headed towards. That means we need the "flow" abstraction as the basis
of our code. The flow abstraction is based on the idea that we should
be able to connect to and read or write any stream of bytes, in user or
kernel space. A flow is a stream of bytes. That is useful for
constructing sys_reiser4() and other semantic enhancements that we
plan. If you would like to generalize the vfs interface so that it
reads and writes to and from flows, with user space buffers as just a
special case used by read() and write(), and so that it does not work a
page at a time, that could be good for vfs.
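
To make the flow idea concrete, here is a minimal user-space sketch. It is
an illustration only, not the reiser4 source: the names struct flow,
enum flow_space and flow_copy() are hypothetical, and the user/kernel
distinction is reduced to a tag because the sketch runs outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tag: which address space the bytes of a flow live in. */
enum flow_space { FLOW_USER, FLOW_KERNEL };

/* Hypothetical flow: a stream of bytes with a cursor and a remaining length. */
struct flow {
    enum flow_space space;
    char *data;     /* next byte to read or write */
    size_t length;  /* bytes remaining in the stream */
};

/*
 * Move as many bytes as both flows allow from src to dst and advance
 * both cursors. In a kernel, a FLOW_USER side would go through
 * copy_from_user()/copy_to_user(); this user-space sketch uses memcpy()
 * for both cases.
 */
size_t flow_copy(struct flow *dst, struct flow *src)
{
    size_t n;

    assert(dst != NULL && src != NULL);
    n = src->length < dst->length ? src->length : dst->length;
    memcpy(dst->data, src->data, n);
    dst->data += n;
    dst->length -= n;
    src->data += n;
    src->length -= n;
    return n;
}
```

The point of the sketch is that the caller hands over a whole stream rather
than one page, so the code underneath is free to decide how much to move per
step.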

> What more do you need for read performance than a proper
>mpage_readpage(s) that can use the extent-enabled
>
What does extent-enabled mean? Do you mean that it passes more than one
page to the fs? Or do you mean that readahead windows are set by the fs
because readahead is affected by file layout? Or both?

> get_blocks callback?
>I'll post a patch for that ASAP, it benefits the other filesystems as well.
>
>I know the write path is more complex, and doing your own ->writepage(s)
>is fine because that may need to do a lot of FS-specific things. I'd
>really like to see you use generic_file_write or at least the major
>routines it's built from. What's missing for you to do that?
>
Well, to start with, reiser4_write is plugin specific. Writing to a
file might require modifying a file that it inherits its contents from,
for instance.
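
A rough sketch of what plugin-specific write means, under stated
assumptions: struct file_plugin, struct file_obj, plain_write() and
inheriting_write() are hypothetical names for illustration, not reiser4
interfaces. Writing through a file that inherits its contents ends up
modifying the file it inherits from.

```c
#include <assert.h>
#include <stddef.h>

struct file_obj;

/* Hypothetical per-file plugin: each file type supplies its own write. */
struct file_plugin {
    long (*write)(struct file_obj *f, const char *buf, size_t len);
};

/* Hypothetical file object; 'parent' is the file it inherits contents from. */
struct file_obj {
    const struct file_plugin *plugin;
    struct file_obj *parent;
    size_t size;
};

/* Ordinary file: the write lands in the file itself (only size is tracked). */
long plain_write(struct file_obj *f, const char *buf, size_t len)
{
    (void)buf;
    f->size += len;
    return (long)len;
}

/* Inheriting file: the write is forwarded to the file it inherits from. */
long inheriting_write(struct file_obj *f, const char *buf, size_t len)
{
    assert(f->parent != NULL);
    return f->parent->plugin->write(f->parent, buf, len);
}

const struct file_plugin plain_plugin = { plain_write };
const struct file_plugin inheriting_plugin = { inheriting_write };

/* Entry point: dispatch through whatever plugin the file carries. */
long file_write(struct file_obj *f, const char *buf, size_t len)
{
    return f->plugin->write(f, buf, len);
}
```

A single generic write routine cannot know about the forwarding step; only
the plugin attached to the file does.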

I think though that there is another consideration, which is that
reiser4 is not licensed just for Linux. We need to create a coherent
body of code that can be ported to operating systems that don't have vfs
layers specially modified for us. The easiest way to do that is to
write our own code for the key things like read and write.

VFS is supposed to allow filesystems to have their own read and write
implementations. That is its defined interface. You should allow us to
use it. Other filesystems that are less ambitious might well want to
use generic code, but reiser4 is not a yellow package with black lettering.

> Note
>that your read/write path will need a major rewrite anyway due to the
>flaws I pointed out, and I think it would benefit both you (less
>maintenance overhead, feature and bug parity) and the other
>filesystem/vfs/vm hackers (less duplicate functionality to worry about,
>improvements you make shared by every filesystem) if you tried to
>converge to use common code.
>
