More NFS client problems with 2.4.20

From: Joshua Weage
Date: Wed Oct 01 2003 - 09:22:15 EST


This is a follow-up to the previous post on NFS client problems with
the RedHat 2.4.18 - 2.4.20 kernels.

After 3 weeks of no problems, the problem showed up again on a
different cluster node. Four out of 12 nodes have now shown this
problem, but it seems to affect nodes randomly. It always occurs
during heavy NFS writes. A job running on the problem node stops,
waiting for NFS read/writes. All of the other nodes continue to see
the NFS server just fine. If I try to do an ls on the NFS mount point
on the problem node, the shell locks up.

Checking nfsstat, retrans is not going up, however, getattr is
increasing about 2 per sec. Also, the stuck processes always show a
WCHAN of "end".

Any ideas what is going wrong?

Thanks,

Joshua Weage

=====


__________________________________
Do you Yahoo!?
The New Yahoo! Shopping - with improved product search
http://shopping.yahoo.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/