Re: 2.1.60 NFS probs...

Jonathan Stanton (jonathan@cs.jhu.edu)
Tue, 28 Oct 1997 10:38:14 -0500 (EST)


Hi,

I've been running the 2.1.60 kernel on a dual Pentium II system for
several days now, and I have noticed a possibly related problem with NFS.
The NFS server is a BSDI/3.0 box. I don't have Matthew's problems with NFS
startup, but I have had the same problem with NFS-intensive writes
causing a process to get stuck.

Scenario:
1) Running X, use Netscape for a while (my home directory is NFS-mounted,
so my Netscape cache is over NFS). After switching to a new page with
lots of graphics, Netscape hangs.
2) Checking it with ps -l, I see it is stuck in 'nfs_delete' in
the 'D' state. I think the function name is cut off by the limited field
size; from looking at the source, I think it is nfs_delete_inode() in
fs/nfs/inode.c.
3) From the source it looks like the client has entered an
uninterruptible loop, and even though there is a timeout of (5*HZ), it
had not recovered after I waited about half an hour.
4) The Netscape process is unkillable: kill -9, kill as root, nothing
I could do would get rid of it. Since it was eating 12M of my 32M system,
I rebooted, and when unmounting the file systems I got errors like
these.

> RPC: sendmsg returned error 101
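
For reference, a minimal sketch of decoding that number, assuming it is a
plain errno value. On Linux, errno 101 is ENETUNREACH ("Network is
unreachable"), which would fit the network being taken down before the NFS
file systems were unmounted. The helper name describe_errno is made up for
this example, and errno numbering differs between operating systems, so
the lookup is only meaningful when run on Linux.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform.

    describe_errno is a hypothetical helper for illustration only.
    """
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# 101 is the value from the RPC message; the result depends on the OS.
name, message = describe_errno(101)
print(f"errno 101 on this system: {name} ({message})")
```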

The Netscape hang happened again after the reboot, so I don't
think it was a fluke.
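
The 'D' state check from step 2 can also be scripted. The following is a
rough sketch, assuming a Linux system with /proc mounted and the usual
"pid (comm) state ..." layout of /proc/<pid>/stat. The function names are
made up for this example, and this is not how ps itself is implemented.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the text of a /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST ')' to find where the state field begins.
    left, right = stat_line.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    return int(pid_str), comm, right.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in the 'D' state."""
    stuck = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return stuck  # no /proc here (e.g. not Linux)
    for entry in entries:
        if not entry.isdigit():
            continue  # not a per-process directory
        path = os.path.join(proc_root, entry, "stat")
        try:
            with open(path, errors="replace") as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or unexpected format
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A process that stays in this list across repeated runs, as the Netscape
process did here, is blocked inside the kernel and will not react to
signals until the kernel-side wait completes.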

On Tue, 28 Oct 1997, Matthew Kirkwood wrote:

> Hi,
>
> I'm experiencing some difficulty with NFS-root in 2.1.60 (+Bill Hawes
> nfs_client60-patch):
[..]
>
> However, the following rather painless message can be repeatably
> provoked by ( cd /usr/src/linux ; make menuconfig ), among other
> things:
> NFS: invalidating pending RPC requests
>
> Once this has happened, that process hangs, and any logins on other
> consoles fail also. Stopping and starting the NFS server fixes this
> and ps shows a
> /bin/sh scripts/Menuconfig arch/i386/config.in
> in the "D" state and nothing else of particular interest.
> "ps l" claims that every process but itself is in "end" which I find
> vaguely confusing.

I think that is because the 'correct' System.map file is not in /boot (on
Red Hat 4.2). When I had a 2.0.30 System.map file there, I got 'end' for all
processes, but when I copied the current 2.1.60 System.map there, I got
correct-looking function names.
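
To illustrate why a stale System.map could give the same name for every
process, here is a small sketch of the kind of lookup involved: take the
address the process is sleeping at and pick the symbol with the highest
address that does not exceed it. All addresses and the two tiny maps below
are invented for the example, and the real ps code may differ in detail.
If every wait address lies above the last symbol of the old map, every
lookup lands on that last symbol (_end), which would be consistent with ps
showing 'end' everywhere.

```python
import bisect


def parse_system_map(text):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def lookup_wchan(symbols, address):
    """Return the name of the last symbol at or below address, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    return symbols[i][1] if i >= 0 else None


# Hypothetical maps: OLD_MAP stands in for a System.map from another kernel,
# NEW_MAP for the one matching the running kernel. Addresses are made up.
OLD_MAP = """\
00100000 T _stext
0012f000 T schedule
001f0000 B _end
"""
NEW_MAP = """\
c0100000 T _stext
c0150000 T nfs_delete_inode
c0150200 T nfs_read_inode
c0200000 B _end
"""
WCHAN = 0xC0150080  # hypothetical address inside nfs_delete_inode

print(lookup_wchan(parse_system_map(OLD_MAP), WCHAN))
print(lookup_wchan(parse_system_map(NEW_MAP), WCHAN))
```

With the mismatched map the lookup falls through to the final symbol; with
the matching map it resolves to the function containing the address.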

Jonathan

-------------------------------------------------------
Jonathan R. Stanton jonathan@cs.jhu.edu
Dept. of Computer Science Finger for PGP key
Johns Hopkins University
PGP: 13 EE B7 5D 3B F9 5E 9C 41 30 AA 78 30 54 59 D0
-------------------------------------------------------