nfsd: terminating on error 104 problem
From: Johan van den Dorpe
Date: Wed Feb 25 2004 - 11:57:44 EST
Hi all
We are currently using quite a number of HP DL380 servers within our
company that use the 2.4.25 kernel. These are primarily used for heavy
NFS access, so we keep a large number of nfsd processes concurrently
running. Over time, however, we have noticed that nfsd processes
periodically die. The system logs show numerous entries like these:
Feb 22 12:25:24 ps29 kernel: nfsd: recvfrom returned errno 104
Feb 22 12:25:24 ps29 kernel: nfsd: terminating on error 104
At the moment we run a cron script that counts the number of nfsds and
restarts rpc.nfsd if they drop below a threshold. Although this is a
working solution, it's not ideal and we would really like to get this
problem patched up properly.
From my limited knowledge of the kernel source I can see that
"terminating on error 104" corresponds to line 221 of
/usr/src/linux-2.4.25/fs/nfsd/nfssvc.c, so svc_recv on line 191 must
be returning -104.
I've noticed that in the 2.6.0 kernel there are quite a few changes to
nfssvc.c, and I wondered if they dealt with this situation.
In the meantime, are there any quick hacks I could add to nfssvc.c to
make it tolerate error -104? Could I safely alter the main request loop
to simply continue execution if svc_recv returns this code?
Any help would be much appreciated.
many thanks
--
Johan van den Dorpe