Re: v2.4.0test9 NFSv3 server woes Linux-->Solaris

From: Bill Rugolsky Jr. (rugolsky@ead.dsa.com)
Date: Thu Oct 05 2000 - 10:38:26 EST


On Thu, Oct 05, 2000 at 04:58:39PM +0200, David Weinehall wrote:
> Using the NFSv3 server in the v2.4.0test9 kernel (I haven't tested any
> earlier v2.3.xx or v2.4.0testx kernels) I'm having problems with
> (for instance) compile glib.
>
> The setups I've tried are:
>
> wsize = rsize = 1kB
> Linux NFSv3 server --> Linux NFSv3 client (UDP mounted) -- WORKS
>
> wsize = rsize = 32kB
> Linux NFSv3 server --> Solaris NFSv3 client (UDP mounted) -- BROKEN!
> Linux NFSv3 server --> Solaris NFSv3 client (TCP mounted) -- BROKEN!
>
> wsize = rsize = 2kB
> Linux NFSv3 server --> Solaris NFSv3 client (UDP mounted) -- BROKEN!
> Linux NFSv3 server --> Solaris NFSv3 client (TCP mounted) -- BROKEN!

What do you mean by "BROKEN" ? Anything in syslog? tcpdumps?
 
Why not test wsize=rsize=1K for the Linux/Solaris combo?
Also, I was unaware that TCP server was supposed to work in 2.4.0-test9.
(It isn't in the 2.2.x patches.) Are you sure that Solaris is not falling
back to UDP mounts?
 
> Oh, by the way, is there ANY sane reason whatsoever behind the decision
> that the Linux NFSv3 client in the v2.2.18pre15 kernel defaults to wsize
> = rsize = 1kB and the NFSv3 client in v2.4.0test9 defaults to
> wsize = rsize = 4kB?! Every (?) other implementation of NFSv3 defaults
> to 32kB... At least when mounting Solaris NFSv3 server --> Linux NFSv3
> client, 32kB rsize & wsize works perfectly fine (at least for
> v2.2.18pre15, but I hope that v2.4.0test9 isn't worse in this regard.)

The conservatism is similar to that for IDE tuning: many folks have
broken hardware/drivers/networks, and sizes above 1K result in
fragmentation/potential packet loss/RPC timeouts/write errors/corruption.
So for 2.2.x, Alan has decreed 1K size. 2.4.0-test is a bit more aggressive.

32K is fine, if you are using TCP. But I just went through a day-long
session with NetApp after they updated their default UDP size from 8K
to 32K. 32K UDP == 23 fragments. On a switched network that may be
fine, but with a hub and router, it is seeming death. Our Solaris
clients were generating numerous RPC timeouts on writes. After setting
the NetApp F720 server default back to 8K, the timeouts went away.

You may want to take this over to nfs@sourceforge.net; also, tcpdumps
are helpful.

Regards,

   Bill Rugolsky
   rugolsky@ead.dsa.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Oct 07 2000 - 21:00:16 EST