Re: mmap over nfs leads to excessive system load

From: Andrew Morton
Date: Wed Nov 16 2005 - 16:31:41 EST


Trond Myklebust <trond.myklebust@xxxxxxxxxx> wrote:
>
> On Wed, 2005-11-16 at 11:09 -0800, Andrew Morton wrote:
> > Trond Myklebust <trond.myklebust@xxxxxxxxxx> wrote:
> > >
> > > On Wed, 2005-11-16 at 10:00 -0800, Andrew Morton wrote:
> > >
> > > > That will fix it, but the PageWriteback accounting is still wrong.
> > > >
> > > > Is it not possible to use set_page_writeback()/end_page_writeback()?
> > >
> > > Not really. The pages aren't flushed at this time. We the point is to
> > > gather several pages and coalesce them into one over-the-wire RPC call.
> > > That means we cannot really do it from inside ->writepage().
> > >
> >
> > I still don't get it.
> >
> > Once nfs_writepage() has been called, the page is conceptually "under
> > writeback", yes? In that, at some point in the future, it will be written
> > to backing store.
> >
> > Hence it's perfectly appropriate to run set_page_writepage() within
> > nfs_writepage(). It's a matter of finding the right place for the
> > end_page_writeback().
>
> The point is that the process of flushing has not been started at that
> time, so anybody that calls wait_on_page_writeback() immediately after
> calling writepage() may end up waiting for a very long time indeed
> (probably until the next pdflush).

But block-backed filesytems have the same concern: we don't want to do a
whole bunch of 4k I/Os. Hence the writepages() interface, which is the
appropriate place to be building up these large I/Os.

NFS does nfw_writepages->mpage_writepages->nfs_writepage and to build the
large I/Os it leaves the I/O pending on return from nfs_writepage(). It
appears to flush any pending pages on the exit path from nfs_writepages().

If that's a correct reading then there doesn't appear to be any way in
which there's dangling I/O left to do after nfs_writepages() completes.

If there _is_ dandling I/O left over then that's problematic, and probably
doesn't buy us much in the way of performance benefit.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/