Re: 2.4.17 NFS hangup

From: Athanasius (Athanasius@gurus.tf)
Date: Mon Feb 04 2002 - 08:21:46 EST


On Mon, Feb 04, 2002 at 12:00:30AM +0100, Burján Gábor wrote:
> On Sun, Feb 03, Trond Myklebust wrote:
>
> > Are you seeing any kernel log messages about 'Tx FIFO error!' that
> > might indicate that particular code is getting triggered?
>
> No, nothing logged except the NFS related messages. However, after NFS
> hangup I cannot scp from the host, but ssh works... I am beginning to
> think that this is not an NFS issue. Then what could it be?

   I'm seeing something like this as well. The two machines involved are
on BNC/thinwire (yes, I know, waiting on finances to make this better),
with 2 other machines on the same segment. The workstation (emelia) has
an NFS mount from the server (jimblewix) which I use for, amongst other
things, playing mp3s.
   Machine specs:

        SERVER
        PII-400 @400MHz
        384MB PC100 SDRAM
        eth0: NE2000 (ISA) <--- internal interface
        eth1: 3com509b <--- external interface, NFS traffic NOT on this
        Linux jimblewix 2.4.17 #7 Sat Jan 5 16:15:44 GMT 2002 i686 unknown

        WORKSTATION
        AMD Athlon XP 1600+ 1.4GHz, not overclocked
        512MB PC2100 DDR
        eth0: NE2000 (PCI eth0: NetVin NV5000SC found at 0xdc00, IRQ 11,
        00:40:95:45:91:38.)
        Linux emelia 2.4.18-pre7 #3 Thu Jan 31 07:07:48 GMT 2002 i686 unknown
        ALSO on 2.4.17

Repeatedly I'll have xmms stop playing an mp3 mid-file due to NFS
timeouts. I have the same problem cp'ing large files over the NFS
mounts as well. Currently these are soft mounts. If I change them to
hard mounts then, rather than getting an i/o error on that file and
control coming back, the app just locks hard in D state until a reboot.
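
For what it's worth, with soft mounts the i/o error at least comes back to
the caller, so a copy can be retried from where it failed instead of being
restarted. A rough Python sketch of that idea follows. It is a workaround
for testing, not a fix for the timeouts themselves; the names
read_with_retry and copy_file and the retry count and delay are made up
for illustration.

```python
import errno
import os
import time


def read_with_retry(read_at, offset, size, retries=5, delay=2.0,
                    sleep=time.sleep):
    """Call read_at(offset, size), retrying when it fails with EIO.

    EIO is what a soft NFS mount hands back after a timeout. Any other
    error, or EIO after `retries` attempts, is re-raised unchanged.
    """
    attempt = 0
    while True:
        try:
            return read_at(offset, size)
        except OSError as exc:
            if exc.errno != errno.EIO or attempt >= retries:
                raise
            attempt += 1
            sleep(delay)  # give the server a moment before trying again


def copy_file(src_path, dst_path, chunk=8192, **retry_opts):
    """Copy src_path to dst_path in chunks, resuming after EIO on reads."""
    fd = os.open(src_path, os.O_RDONLY)
    try:
        with open(dst_path, "wb") as dst:
            offset = 0
            while True:
                # os.pread takes an explicit offset, so a retry re-reads
                # the same chunk rather than skipping ahead.
                data = read_with_retry(
                    lambda off, n: os.pread(fd, n, off),
                    offset, chunk, **retry_opts)
                if not data:
                    break
                dst.write(data)
                offset += len(data)
    finally:
        os.close(fd)
```

The chunk size of 8192 simply mirrors the rsize in use here; nothing
depends on the two matching.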

NFS mounts on the WORKSTATION (mount output):

192.168.0.162:/home/users on /home/users type nfs (rw,nosuid,nodev,nolock,rsize=8192,wsize=8192,soft,intr,addr=192.168.0.162)
192.168.0.162:/usr/local on /export/miggy-1/usr-local type nfs (rw,nosuid,nodev,rsize=8192,wsize=8192,soft,intr,addr=192.168.0.162)
192.168.0.162:/other on /other type nfs (rw,nosuid,nodev,rsize=8192,wsize=8192,soft,intr,addr=192.168.0.162)

That last one is usually where I'm doing the big cp'ing to/from.
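
The three entries above are in the format that mount prints. If anyone
wants to compare options across machines, here is a small Python sketch
that pulls the NFS entries apart. The function names parse_mount_line and
nfs_mounts are made up for illustration, and the regular expression only
assumes the "SOURCE on MOUNTPOINT type FSTYPE (OPTIONS)" layout seen above.

```python
import re

# Layout assumed: "<source> on <mountpoint> type <fstype> (<opt>,<opt>=<val>,...)"
MOUNT_RE = re.compile(
    r"^(?P<src>\S+) on (?P<mnt>\S+) type (?P<fstype>\S+) \((?P<opts>[^)]*)\)$")


def parse_mount_line(line):
    """Return a dict describing one mount line, or None if it doesn't match."""
    m = MOUNT_RE.match(line.strip())
    if m is None:
        return None
    opts = {}
    for item in m.group("opts").split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Flags such as "soft" have no value; store True for those.
        opts[key] = value if sep else True
    return {
        "source": m.group("src"),
        "mountpoint": m.group("mnt"),
        "fstype": m.group("fstype"),
        "options": opts,
    }


def nfs_mounts(lines):
    """Keep only the entries whose filesystem type is nfs."""
    parsed = (parse_mount_line(line) for line in lines)
    return [e for e in parsed if e is not None and e["fstype"] == "nfs"]


SAMPLE = [
    "192.168.0.162:/home/users on /home/users type nfs "
    "(rw,nosuid,nodev,nolock,rsize=8192,wsize=8192,soft,intr,addr=192.168.0.162)",
    "192.168.0.162:/usr/local on /export/miggy-1/usr-local type nfs "
    "(rw,nosuid,nodev,rsize=8192,wsize=8192,soft,intr,addr=192.168.0.162)",
    "192.168.0.162:/other on /other type nfs "
    "(rw,nosuid,nodev,rsize=8192,wsize=8192,soft,intr,addr=192.168.0.162)",
]

if __name__ == "__main__":
    for entry in nfs_mounts(SAMPLE):
        opts = entry["options"]
        kind = "soft" if "soft" in opts else "hard"
        print(entry["mountpoint"], kind, opts.get("rsize"), opts.get("wsize"))
```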

I've just had the problem twice whilst typing this email:

Feb 4 13:07:31 emelia kernel: nfs: server 192.168.0.162 not responding, timed out
Feb 4 13:07:52 emelia last message repeated 2 times
Feb 4 13:12:17 emelia kernel: nfs: server 192.168.0.162 not responding, timed out
Feb 4 13:12:38 emelia last message repeated 2 times

<NOTHING in /var/log/kern.log on jimblewix>
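
One way to put numbers on how often this happens is to tally these
messages from the client's kernel log. A minimal Python sketch, assuming
each log entry sits on a single line (as it does in the log file itself)
and that a "last message repeated N times" entry refers to the entry
immediately before it. The function name count_nfs_timeouts is made up for
illustration.

```python
import re

TIMEOUT_RE = re.compile(r"nfs: server (\S+) not responding")
REPEAT_RE = re.compile(r"last message repeated (\d+) times?")


def count_nfs_timeouts(lines):
    """Return {server: number of 'not responding' messages}.

    A syslog "last message repeated N times" line adds N to the server
    named in the entry directly before it.
    """
    counts = {}
    last_server = None
    for line in lines:
        hit = TIMEOUT_RE.search(line)
        if hit:
            last_server = hit.group(1)
            counts[last_server] = counts.get(last_server, 0) + 1
            continue
        rep = REPEAT_RE.search(line)
        if rep and last_server is not None:
            counts[last_server] += int(rep.group(1))
        else:
            # Some unrelated entry: a later "repeated" line is not ours.
            last_server = None
    return counts


SAMPLE = [
    "Feb 4 13:07:31 emelia kernel: nfs: server 192.168.0.162 not responding, timed out",
    "Feb 4 13:07:52 emelia last message repeated 2 times",
    "Feb 4 13:12:17 emelia kernel: nfs: server 192.168.0.162 not responding, timed out",
    "Feb 4 13:12:38 emelia last message repeated 2 times",
]

if __name__ == "__main__":
    print(count_nfs_timeouts(SAMPLE))
```

On the four entries quoted above this comes to six timeouts against
192.168.0.162 in roughly five minutes.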

I haven't had any of the following since this line:

kern.log.2.gz:1649:Jan 18 07:39:28 emelia kernel: nfs: task 13016 can't get a request slot

   Whilst I appreciate that thinnet/BNC isn't the best technology to be
using, this segment isn't THAT busy most of the time, and certainly not
on the majority of occasions when mp3s cut out (the same files WILL play
fine end to end at other times, so it's not corruption in them).

  If there are any patches/options (other than hard mounts without other
changes) I should be trying, please let me know.

thanks,

-Ath

-- 
- Athanasius = Athanasius(at)gurus.tf / http://www.clan-lovely.org/~athan/
                  Finger athan(at)fysh.org for PGP key
	   "And it's me who is my enemy. Me who beats me up.
Me who makes the monsters. Me who strips my confidence." Paula Cole - ME


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/