Re: NFS in Linux 2.2.1

Olaf Kirch (okir@monad.swb.de)
Tue, 02 Feb 1999 00:02:44 +0100


On Mon, 01 Feb 1999 12:37:24 PST, H.J. Lu wrote:
> The bad news is when the sync write enabled for the Linux kernel
> NFS server, the NFS write performance on the Linux server drops
> dramatically. The client write speed slows down from around 900KB/s
> to around 60KB/s. It seems something is wrong in the Linux kernel
> when the kernel NFS server does sync write. Does anyone have any
> ideas on that?

Basically, the server-side write gathering code is, er, sub-optimal.
I've played with the delay parameters a lot, but I never arrived at
a satisfactory set of values. 60KBps is still a bit low. I've been
seeing something around 200KBps, but that may have been specific to
my combination of client/NICs etc.

What the code is currently(?) doing is checking whether the write goes
to the same file as the last one. If so, we assume that there are more
to come, and delay syncing the data for a bit. If the inode is still
dirty after that, we call write_inode_now(). This _usually_ works,
but it's in no way foolproof.
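
As a rough user-space illustration of that heuristic (not the actual
nfsd code), the sketch below remembers the file the previous write went
to, delays only when the new write hits the same file, and syncs only
if the inode is still dirty afterwards. Everything in it is an
assumption made for the example: struct sim_inode, struct gather_state,
handle_write(), and the delay hook that stands in for the short sleep;
sim_sync() merely stands in for write_inode_now().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel inode; only what the sketch needs. */
struct sim_inode {
    unsigned long ino;   /* inode number */
    bool dirty;          /* unsynced data pending */
    int sync_count;      /* how often this code path had to sync it */
};

/* Remembers which file the previous write request went to. */
struct gather_state {
    bool have_last;
    unsigned long last_ino;
};

/* Example delay hook: pretend a peer flushed the inode while we slept. */
static void flushed_by_peer(struct sim_inode *inode)
{
    inode->dirty = false;
}

/* Stand-in for write_inode_now(): count the sync, mark the inode clean. */
static void sim_sync(struct sim_inode *inode)
{
    inode->sync_count++;
    inode->dirty = false;
}

/* Handle one write request; `delay` models the short gathering sleep. */
static void handle_write(struct gather_state *st, struct sim_inode *inode,
                         void (*delay)(struct sim_inode *))
{
    bool same_file;

    assert(st != NULL && inode != NULL);

    /* Did the previous write go to the same file? */
    same_file = st->have_last && st->last_ino == inode->ino;
    st->have_last = true;
    st->last_ino = inode->ino;

    inode->dirty = true;            /* the write itself dirties the inode */

    /* Same file as last time: assume more writes follow and wait a bit. */
    if (same_file && delay != NULL)
        delay(inode);

    /* Sync only if nobody cleaned the inode in the meantime. */
    if (inode->dirty)
        sim_sync(inode);
}
```

The weakness is visible even in this toy version: the decision to delay
rests entirely on the previous request, so a lone second write to a file
pays the delay for nothing.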

The problem is that we can't know in advance whether the next NFS request
we receive is another write call for the same file. The packet is not
being looked at until the current nfsd thread sleeps, which it usually
won't unless it has to wait for disk I/O. So the first opportunity to
inspect the next packet comes when we're already syncing the file/inode.
So nfsd will either have to peek at the packet as it's delivered by
the data_ready network callback (UDP only), or do some delaying when
it looks as if it might be possible to cluster the next few calls.

A somewhat improved design might be

	if (we should do write gathering) {
		static struct wait_queue wait = ...;

		if (less than N nfsd's delaying execution)
			interruptible_sleep_on_timeout(&inode->i_wait, ...);
		if (inode->i_state & I_DIRTY) {
			sync file/inode
		}
		wake_up(&wait);
	}
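
A user-space analogue of the pseudocode above, using POSIX threads, may
make the control flow easier to follow. This is only a sketch under
stated assumptions: struct gather_ctl, struct sim_file,
gather_and_sync() and the MAX_DELAYING limit are invented for the
example, a condition variable stands in for the kernel wait queue, and
clearing a flag stands in for syncing the file/inode.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

#define MAX_DELAYING 2   /* the "N" in the pseudocode; value is arbitrary */

/* Hypothetical shared state for all worker threads. */
struct gather_ctl {
    pthread_mutex_t lock;
    pthread_cond_t wake;   /* stands in for the kernel wait queue */
    int delaying;          /* workers currently sleeping in the delay */
};

/* Hypothetical stand-in for the file/inode being written. */
struct sim_file {
    bool dirty;
    int syncs;
};

static void gather_and_sync(struct gather_ctl *ctl, struct sim_file *f,
                            long timeout_ms)
{
    assert(timeout_ms >= 0);

    pthread_mutex_lock(&ctl->lock);

    /* Only a bounded number of workers may delay at once. */
    if (ctl->delaying < MAX_DELAYING) {
        struct timespec deadline;

        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
        deadline.tv_sec += timeout_ms / 1000 + deadline.tv_nsec / 1000000000L;
        deadline.tv_nsec %= 1000000000L;

        ctl->delaying++;
        /* One timed wait; a wakeup (or spurious return) ends the delay early. */
        pthread_cond_timedwait(&ctl->wake, &ctl->lock, &deadline);
        ctl->delaying--;
    }

    /* Sync only if the file is still dirty after the delay. */
    if (f->dirty) {
        f->dirty = false;
        f->syncs++;
    }

    /* Let the other delayed workers go; they will find the file clean. */
    pthread_cond_broadcast(&ctl->wake);
    pthread_mutex_unlock(&ctl->lock);
}
```

Called from several worker threads for the same file, only the first
one to get past the delay pays for the sync; the others are woken and
find nothing left to write.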

I guess the VFS buffs will come up with something better...

And of course you may have to enable the wgather exports option, even
though I think it's on by default.

Olaf

-- 
Olaf Kirch         |  --- o --- Nous sommes du soleil we love when we play
okir@monad.swb.de  |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
