why nfs server delay 10ms in nfsd_write()?

From: steve
Date: Wed May 18 2005 - 21:52:08 EST


Hi all,
when i build nfs server with linux 2.6.9, i found nfs write performance is very low, and with
tcpdump, i found the time to process write is very long, please refer as follows:

14:59:22.844977 IP 192.168.0.1.3376789825 > linux.site.nfs: 660 write [|nfs]
14:59:22.855134 IP linux.site.nfs > 192.168.0.1.3376789825: reply ok 136 write POST: [|nfs]

the write operation cost nearly 10ms, so i look up the source code and find the following code
in nfsd_write():

{
..
if (atomic_read(&inode-> i_writecount) > 1
|| (last_ino == inode-> i_ino && last_dev == inode-> i_sb-> s_dev)) {
dprintk("nfsd: write defer %d\n", current-> pid);
set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout((HZ+99)/100);
dprintk("nfsd: write resume %d\n", current-> pid);
}
..
}

so it will sleep for 10ms if the condition matches.

i have 2 questions:
1.i don't know why do we have to sleep for 10 ms, why not do sync immediately?
2.what will happen if we don't sleep for 10ms?
when i delete these codes, i get a good result, and the write performace improved from 300KB/s to 18MB/s

Regards!
Steve
2005-05-19


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/