Re: nfs problems with 2.6.18-rc1

From: Neil Brown
Date: Sun Jul 16 2006 - 19:26:26 EST

On Thursday July 13, chexum+dev@xxxxxxxxx wrote:
> Hi!
> I recently updated two (old) hosts to 2.6.18-rc1, and started noticing
> weird things with the nfs mounted /home s.

So this is both the client and the server that you upgraded? That
makes it harder to point the finger of blame :-)

> I frequently face EACCESs where a few minutes ago there wasn't any
> problem, and after a retry everything does work again.

I wonder if that is pointing the finger at commit
8c7b389e532e964f07057dac8a56c43465544759

as that is a recent change that returns 'EACCES'... but I cannot see
that being relevant in this case as it only affects directories.

> How can I help with tracing this? git bisecting on these machines takes
> at least an hour per step, (and no reasonable connectivity either to
> compile elsewhere much quicker).

The standard answer for tracing nfs problems is 'tcpdump'.
   tcpdump -s 0 -w /tmp/trace host $CLIENT and host $SERVER and port 2049

that should show whether the error is coming from the server, or if
the client is generating it all by itself.
If you can get a reasonably small '/tmp/trace', compress it and attach
it to an email.
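
As a rough sketch of the capture step above (not a tested recipe): the
function names nfs_filter and capture_nfs are made up for illustration, and
the packet limit passed to tcpdump's -c option is just one assumed way of
keeping the trace small.

```shell
#!/bin/sh
# Build the capture filter for one client/server pair on the NFS port.
nfs_filter() {
    printf 'host %s and host %s and port 2049\n' "$1" "$2"
}

# Capture at most $3 packets (run as root), then compress the trace.
# $1 and $2 are the client and server host names or addresses.
capture_nfs() {
    tcpdump -s 0 -c "$3" -w /tmp/trace "$(nfs_filter "$1" "$2")" &&
        gzip -9 /tmp/trace      # leaves /tmp/trace.gz, ready to attach
}

# Example (hypothetical host names and packet count):
#   capture_nfs client.example.org server.example.org 2000
```

Start the capture just before reproducing the EACCES so that the failing
request falls inside the packet limit.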

Also turn on tracing. Something like:
  on server
     echo 32767 > /proc/sys/sunrpc/nfsd_debug
  on client
     echo 32767 > /proc/sys/sunrpc/nfs_debug

You can be a bit more selective by only enabling individual flags.
For the server, these are in include/linux/nfsd/debug.h

For the client, they are near the end of include/linux/nfs_fs.h
Not sure which to choose... maybe just all of them.
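
If you do want to pick individual flags, something like the following
sketch ORs the chosen bits into a single number to write to the debug file.
The helper name debug_mask is made up, and the hex values in the example
are placeholders only; take the real bit values from the headers mentioned
above.

```shell
#!/bin/sh
# OR together a list of flag values (hex or decimal) and print the
# resulting mask in decimal, which is the form used with echo above.
debug_mask() {
    mask=0
    for flag in "$@"; do
        mask=$(( mask | $flag ))
    done
    echo "$mask"
}

# Example (as root, placeholder bit values, check the header first):
#   debug_mask 0x0001 0x0010 > /proc/sys/sunrpc/nfs_debug
# Writing 0 turns the tracing off again:
#   echo 0 > /proc/sys/sunrpc/nfs_debug
```

The debug output goes to the kernel log, so it can be read with dmesg or
from wherever syslog puts kernel messages.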

