Re: batched write
From: Hans Reiser
Date: Mon Jun 19 2006 - 12:49:59 EST
Andreas Dilger wrote:
>On Jun 17, 2006 10:04 -0700, Andrew Morton wrote:
>
>
>>On Thu, 15 Jun 2006 02:08:32 +0400
>>"Vladimir V. Saveliev" <vs@xxxxxxxxxxx> wrote:
>>
>>
>>
>>>The core of generic_file_buffered_write is
>>>do {
>>> grab_cache_page();
>>> a_ops->prepare_write();
>>> copy_from_user();
>>> a_ops->commit_write();
>>>
>>> filemap_set_next_iovec();
>>> balance_dirty_pages_ratelimited();
>>>} while (count);
>>>
>>>
>>>Would it make sense to rework this code by adding a new address_space
>>>operation, fill_pages, so that it looks like:
>>>
>>>do {
>>> a_ops->fill_pages();
>>> filemap_set_next_iovec();
>>> balance_dirty_pages_ratelimited();
>>>} while (count);
>>>
>>>A generic implementation of fill_pages would look like:
>>>
>>>generic_fill_pages()
>>>{
>>> grab_cache_page();
>>> a_ops->prepare_write();
>>> copy_from_user();
>>> a_ops->commit_write();
>>>}
>>>
>>>
>>>
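For readers following along, here is a minimal user-space sketch of the
split proposed above. It is a model only, not kernel code: file_model,
aops_model, buffered_write and batched_fill_pages are hypothetical names,
a fixed 4096-byte page is assumed, and memcpy() stands in for
copy_from_user(). The generic fill_pages handles one page per call, as in
the quoted pseudocode, while the batched variant shows what a filesystem
could do if it were handed the whole remaining range.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096u                 /* assumed page size for this model */

struct file_model;

/* Hypothetical stand-in for struct address_space_operations. */
struct aops_model {
    /* Copy up to len bytes from src to file offset pos; return bytes consumed. */
    size_t (*fill_pages)(struct file_model *f, size_t pos,
                         const char *src, size_t len);
};

/* Hypothetical stand-in for a file's page cache: four pages of data. */
struct file_model {
    const struct aops_model *a_ops;
    char data[4 * PAGE_SIZE];
    unsigned fill_calls;                /* number of fill_pages invocations */
};

/* Generic version: at most one page per call, like the original loop body. */
static size_t generic_fill_pages(struct file_model *f, size_t pos,
                                 const char *src, size_t len)
{
    size_t chunk = PAGE_SIZE - pos % PAGE_SIZE;   /* room left in this page */

    if (chunk > len)
        chunk = len;
    assert(pos + chunk <= sizeof f->data);
    /* grab_cache_page() and prepare_write() would happen here */
    memcpy(f->data + pos, src, chunk);            /* models copy_from_user() */
    /* commit_write() would happen here */
    return chunk;
}

/* Batched version: a filesystem that takes the whole range in one call. */
static size_t batched_fill_pages(struct file_model *f, size_t pos,
                                 const char *src, size_t len)
{
    assert(pos + len <= sizeof f->data);
    memcpy(f->data + pos, src, len);
    return len;
}

static const struct aops_model generic_aops = { .fill_pages = generic_fill_pages };
static const struct aops_model batched_aops = { .fill_pages = batched_fill_pages };

/* The reworked loop: the per-page steps are hidden behind fill_pages. */
static size_t buffered_write(struct file_model *f, size_t pos,
                             const char *src, size_t count)
{
    size_t written = 0;

    if (pos > sizeof f->data || count > sizeof f->data - pos)
        return 0;                       /* this model has no file growth */
    while (count) {
        size_t n = f->a_ops->fill_pages(f, pos, src, count);

        if (n == 0)
            break;
        f->fill_calls++;
        pos += n;
        src += n;
        count -= n;
        written += n;
        /* filemap_set_next_iovec() and balance_dirty_pages_ratelimited()
         * would run here in the real loop */
    }
    return written;
}
```

In this model, a 10000-byte write at offset 100 takes three fill_pages
calls through generic_aops and one through batched_aops, and both leave
the same bytes in the file.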
>>There's nothing which leaps out and says "wrong" in this. But there's
>>nothing which leaps out and says "right", either. It seems somewhat
>>arbitrary, that's all.
>>
>>We have one filesystem which wants such a refactoring (although I don't
>>think you've adequately spelled out _why_ reiser4 wants this).
>>
>>To be able to say "yes, we want this" I think we'd need to understand which
>>other filesystems would benefit from exploiting it, and with what results?
>>
>>
>
>With the caveat that I didn't see the original patch, if this can be a step
>down the road toward supporting delayed allocation at the VFS level then
>I'm all for such changes.
>
>
What do you mean by supporting delayed allocation at the VFS level? Do
you mean having the VFS call into the FS, or maybe just not stepping on
the FS's toes so much, or something else? Delayed allocation is very
FS-specific, insofar as I can imagine it.
>Lustre goes to some lengths to batch up reads and writes on the client into
>large (1MB+) RPCs in order to maximize performance. Similarly on the
>server we essentially bypass the VFS in order to allocate all of the RPC's
>blocks in one call and do a large bio write in a second. It just isn't
>possible to maximize performance if everything is split into PAGE_SIZE
>chunks.
>
>I believe XFS would benefit from delayed allocation, and the ext3-delalloc
>patches from Alex also provide a large part of the performance wins for
>userspace IO, when they allow large sys_write() and VM cache flush to
>efficiently call into the filesystem to allocate many blocks at once, and
>then push them out to disk in large chunks.
>
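To put rough numbers on the PAGE_SIZE point above, here is a small sketch
that counts allocator calls for a single write under two policies: one
call per page touched, versus one call for the whole range. The function
names and the fixed 4096-byte page are assumptions made for illustration;
real filesystems allocate in their own units and may need more than one
call for a fragmented range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u                 /* assumed page size for this model */

/* Number of pages touched by a write of len bytes starting at offset pos. */
static size_t pages_spanned(size_t pos, size_t len)
{
    size_t first, last;

    if (len == 0)
        return 0;
    assert(len - 1 <= SIZE_MAX - pos);  /* the range must not overflow */
    first = pos / PAGE_SIZE;
    last = (pos + len - 1) / PAGE_SIZE;
    return last - first + 1;
}

/* Policy 1: the write path asks the allocator once for every page. */
static size_t alloc_calls_per_page(size_t pos, size_t len)
{
    return pages_spanned(pos, len);
}

/* Policy 2: the filesystem sees the whole range and asks once. */
static size_t alloc_calls_batched(size_t pos, size_t len)
{
    return pages_spanned(pos, len) ? 1 : 0;
}
```

Under these assumptions, an aligned 1MB write costs 256 allocator calls
page by page and one call when batched.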
>Cheers, Andreas
>--
>Andreas Dilger
>Principal Software Engineer
>Cluster File Systems, Inc.