Re: pagevecs of more than PAGEVEC_SIZE pages

From: Evgeniy Polyakov
Date: Sat May 24 2008 - 06:39:35 EST


On Fri, May 23, 2008 at 12:13:58PM -0500, Steve French (smfrench@xxxxxxxxx) wrote:
> Sounds interesting - looks like the only user at the moment is NFS's
> SunRPC right (so no other examples to look at)?
>
> How much better is this than send_msg?

It depends. In the common case, with modest request sizes on a 64-bit
arch (or lowmem pages on 32-bit x86), the difference is marginal, but
with highmem pages and huge requests it gave an additional 10 MB/s for
bulk writing in POHMELFS.

> For the write path (coming from writepage and writepages) - we have a
> small header followed by a list of sequential pages (the servers
> either support 4 pages (old windows), 15 pages (some windows and
> NetApp etc. filers), or 31 pages (older Samba and other Windows) or
> 2048 pages (current Samba supports up to 8MB writes, and this may be
> very useful now that Samba can call splice/receivefile and not have to
> do the extra copy)

kernel_sendpage() takes only a single page argument, but that should not
be a problem, since the stack will combine multiple pages into a single
skb when it can (especially with stuff like TSO/GSO). So just loop over
whatever pages you fetched via find_get_pages_tag() and send them one by
one; the network stack will coalesce them by itself.
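
A minimal userspace sketch of that per-page loop, only to show its
shape. Everything prefixed model_ is a hypothetical stand-in invented
for this illustration: model_sendpage() merely mirrors the argument
order of kernel_sendpage() (socket, page, offset, size, flags) and,
like the real call, may accept fewer bytes than requested, so the loop
retries from the new offset. It is not kernel code and does not model
skb coalescing.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for struct page; only an index is kept. */
struct model_page {
	unsigned long index;
};

/* Hypothetical stand-in for a socket: counts queued bytes and calls. */
struct model_sock {
	size_t bytes_queued;
	unsigned int calls;
	size_t max_per_call;	/* simulates short sends; 0 means no limit */
};

/*
 * Stand-in shaped like kernel_sendpage(): returns the number of bytes
 * accepted, which may be less than the size asked for.
 */
static int model_sendpage(struct model_sock *sock, struct model_page *page,
			  int offset, size_t size, int flags)
{
	size_t n = size;

	(void)page;
	(void)offset;
	(void)flags;
	if (sock->max_per_call && n > sock->max_per_call)
		n = sock->max_per_call;
	sock->bytes_queued += n;
	sock->calls++;
	return (int)n;
}

/* Send a batch of pages, one call per page, retrying after short sends. */
static int send_page_batch(struct model_sock *sock,
			   struct model_page *pages, unsigned int nr)
{
	unsigned int i;

	assert(sock && pages);
	for (i = 0; i < nr; i++) {
		size_t off = 0;

		while (off < MODEL_PAGE_SIZE) {
			int ret = model_sendpage(sock, &pages[i], (int)off,
						 MODEL_PAGE_SIZE - off, 0);
			if (ret <= 0)
				return ret ? ret : -1;
			off += (size_t)ret;
		}
	}
	return 0;
}
```

In a real filesystem the batch would be the array filled in by
find_get_pages_tag(), and the caller would not build the skb itself.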

But note that a return from kernel_sendpage() does not guarantee that the
page has been sent. You can release it (drop the refcnt; the page will
not be freed, since the network stack takes its own reference), but you
cannot write to it until the reply from the server arrives. So just lock
the page and mark it as being under writeback, and then unlock it and
call end_page_writeback() after CIFSSMBWrite2(), just like it is done
right now.
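
The lifetime rule above can be sketched as a small userspace state
model. All names here are hypothetical and exist only for this
illustration; struct model_page keeps just a reference count and two
flags, and none of this is the real struct page or the real page cache
API. The point is that the caller's reference can be dropped as soon as
the send call returns, while the page stays off-limits for modification
until the server reply clears writeback.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few page fields this model needs. */
struct model_page {
	int refcount;
	bool locked;
	bool writeback;
};

/* Before sending: lock the page and flag it as under writeback. */
static void begin_write(struct model_page *p)
{
	assert(!p->locked && !p->writeback);
	p->locked = true;
	p->writeback = true;
}

/* Models the network stack taking its own reference on a queued page. */
static void stack_queue_page(struct model_page *p)
{
	p->refcount++;
}

/* Models the stack dropping that reference once the data has left. */
static void stack_done_with_page(struct model_page *p)
{
	p->refcount--;
}

/* The caller may drop its reference right after the send call returns. */
static void caller_put_page(struct model_page *p)
{
	p->refcount--;
}

/* Contents may be modified only while no write is in flight. */
static bool may_modify(const struct model_page *p)
{
	return !p->writeback;
}

/*
 * Server reply arrived: clear writeback, then unlock. This order is
 * the one used in the patch in this message, not an established rule.
 */
static void finish_write(struct model_page *p)
{
	p->writeback = false;
	p->locked = false;
}
```

A typical sequence in this model is begin_write(), stack_queue_page(),
caller_put_page(), then finish_write() once the reply is in; the
reference count never reaches zero while the stack still holds the page.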

Btw, shouldn't end_page_writeback() be called with the page locked in
CIFS? IIRC, in POHMELFS under heavy load I was able to lose the
writeback clearing, probably because of that, but it is subtle and may
not be an issue in CIFS.

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 8636cec..ba23c65 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1391,8 +1391,8 @@ retry:
 				if (rc)
 					SetPageError(page);
 				kunmap(page);
-				unlock_page(page);
 				end_page_writeback(page);
+				unlock_page(page);
 				page_cache_release(page);
 			}
 			if ((wbc->nr_to_write -= n_iov) <= 0)

--
Evgeniy Polyakov