Re: [patch 2/5] fs: introduce new aops and infrastructure

From: Nick Piggin
Date: Thu Mar 15 2007 - 06:05:31 EST


Dmitriy Monakhov wrote:
Nick Piggin <npiggin@xxxxxxx> writes:


Index: linux-2.6/fs/splice.c
===================================================================
--- linux-2.6.orig/fs/splice.c
+++ linux-2.6/fs/splice.c
@@ -559,7 +559,7 @@ static int pipe_to_file(struct pipe_inod
struct address_space *mapping = file->f_mapping;
unsigned int offset, this_len;
struct page *page;
- pgoff_t index;
+ void *fsdata;
int ret;

/*
@@ -569,13 +569,13 @@ static int pipe_to_file(struct pipe_inod
if (unlikely(ret))
return ret;

- index = sd->pos >> PAGE_CACHE_SHIFT;
offset = sd->pos & ~PAGE_CACHE_MASK;

this_len = sd->len;
if (this_len + offset > PAGE_CACHE_SIZE)
this_len = PAGE_CACHE_SIZE - offset;

+#if 0
/*
* Reuse buf page, if SPLICE_F_MOVE is set and we are doing a full
* page.
@@ -587,86 +587,11 @@ static int pipe_to_file(struct pipe_inod
* locked on successful return.
*/
if (buf->ops->steal(pipe, buf))
- goto find_page;
+#endif

One more note. It's looks like you just disabled all fancy zero copy logic.
Off corse this is just rfc patchset.
But i think where is fundamental problem with it:
Previous logic was following:
1)splice code responsible for: stealing(if possible) and loking the page
2)prepare_write() code responsible for: do fs speciffic stuff

But with new write_begin() logic all steps (grubbing, locking, preparing)
happened internaly inside write_begin() witch doesn't even know about what
kind of data will be copied between write_begin/write_end.
So fancy zero copy logic is impossible :(

Check linux-mm: zero-copy splice is broken anyway, and AFAIKS it cannot
really be fixed to work with the current prepare_write.

I think this can be solved somehow, but i dont know yet, how can this be done
without implementing it inside begin_write().

Actually we could do it with begin_write. All we need to do is set a
flag to say that *pagep contains a page that we can use, with a copy of
the write data on it.

The filesystem would then be able to insert that page into the pagecache
*if* it could handle such an operation.

OTOH, is splice stealing support really important? I guess it could be a
nice win for a small niche of workloads, and we probably don't want to
exclude it by design...

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/