NFS stops dead in 2.3.99pre3

From: Russell King (rmk@arm.linux.org.uk)
Date: Sat Mar 25 2000 - 09:13:50 EST


Hi,

I've been reporting strange NFS oddities in all the 2.3.99 kernels thus
far to this list, but I think this has to be the most serious one yet.

While using xaudio to play mp3s on a machine (tasslehoff) from my NFS
server (flint), the sound glitches (the old problem, reported earlier).
I tracked this down to the client pausing between sending requests to
the server for some reason (I never got a response from l-k).

However, with 2.3.99pre3, it eventually stopped, never to resume, and
I see the following on the network:

  root@tasslehoff:[/root]:<1003> tcpdump -i eth0 port 2049 -s 2048
  tcpdump: listening on eth0
  14:04:03.337916 caramon.arm.linux.org.uk.2810905088 > flint.arm.linux.org.uk.nfs: 116 lookup fh Unknown/1 "var"
  14:04:03.337916 caramon.arm.linux.org.uk.2827682304 > flint.arm.linux.org.uk.nfs: 172 write fh Unknown/1 47 (47) bytes @ 4203018 (4203018)
  14:04:03.337916 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2810905088: reply ok 128 lookup fh Unknown/1
  14:04:03.337916 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2827682304: reply ok 96 write
  14:04:03.337916 caramon.arm.linux.org.uk.2844459520 > flint.arm.linux.org.uk.nfs: 116 lookup fh Unknown/1 "lock"
  14:04:03.337916 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2844459520: reply ok 128 lookup fh Unknown/1
  14:04:04.147944 tasslehoff.arm.linux.org.uk.3188470476 > flint.arm.linux.org.uk.nfs: 124 read fh Unknown/1 4096 bytes @ 3092480 (DF)
  14:04:04.217946 tasslehoff.arm.linux.org.uk.3456905932 > flint.arm.linux.org.uk.nfs: 124 read fh Unknown/1 4096 bytes @ 3158016 (DF)
  14:04:13.338212 caramon.arm.linux.org.uk.2861236736 > flint.arm.linux.org.uk.nfs: 108 getattr fh Unknown/1
  14:04:13.338212 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2861236736: reply ok 96 getattr REG 100644 ids 0/0 sz 4203065
  14:04:13.338212 caramon.arm.linux.org.uk.2878013952 > flint.arm.linux.org.uk.nfs: 172 write fh Unknown/1 47 (47) bytes @ 4203065 (4203065)
  14:04:13.338212 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2878013952: reply ok 96 write
  14:04:22.948417 raistlin.arm.linux.org.uk.2459697152 > flint.arm.linux.org.uk.nfs: 40 null
  14:04:22.948417 flint.arm.linux.org.uk.nfs > raistlin.arm.linux.org.uk.2459697152: reply ok 24 null
  14:04:23.338424 caramon.arm.linux.org.uk.2894791168 > flint.arm.linux.org.uk.nfs: 108 getattr fh Unknown/1
  14:04:23.338424 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2894791168: reply ok 96 getattr REG 100644 ids 0/0 sz 4203112
  14:04:23.338424 caramon.arm.linux.org.uk.2911568384 > flint.arm.linux.org.uk.nfs: 172 write fh Unknown/1 47 (47) bytes @ 4203112 (4203112)
  14:04:23.338424 flint.arm.linux.org.uk.nfs > caramon.arm.linux.org.uk.2911568384: reply ok 96 write
  14:04:23.428425 raistlin.arm.linux.org.uk.3910229159 > flint.arm.linux.org.uk.nfs: 116 getattr fh Unknown/1
  14:04:23.428425 flint.arm.linux.org.uk.nfs > raistlin.arm.linux.org.uk.3910229159: reply ok 96 getattr REG 100600 ids 501/12 sz 1062606
  20 packets received by filter
  0 packets dropped by kernel

Note that the NFS server is still alive and servicing requests from
"caramon", but not "tasslehoff".

There are no messages in the logs on "flint" to indicate why it's not
responding to requests. Tasslehoff contains the following kernel
messages:

  root@tasslehoff:[/root]:<1008> dmesg | tail -n 17
  VFS: Mounted root (ext2 filesystem) readonly.
  Freeing unused kernel memory: 120k init
  Adding Swap: 64508k swap-space (priority -1)
  nfs: server flint not responding, still trying
  nfs: server flint not responding, still trying
  tcpdump uses obsolete (PF_INET,SOCK_PACKET)
  device eth0 entered promiscuous mode
  device eth0 left promiscuous mode
  device eth0 entered promiscuous mode
  device eth0 left promiscuous mode
  device eth0 entered promiscuous mode
  device eth0 left promiscuous mode
  device eth0 entered promiscuous mode
  device eth0 left promiscuous mode
  device eth0 entered promiscuous mode
  device eth0 left promiscuous mode
  nfs: task 3420 can't get a request slot

This behaviour started with the 2.3.99 kernel series (it used to work
100% in 2.3.51, if not 2.3.50 iirc).

Kernel versions:
caramon kernel 2.2.14
flint kernel 2.2.14 knfsd nfs-utils 0.1.5
tasslehoff kernel 2.3.99-pre3
   _____
  |_____| ------------------------------------------------- ---+---+-
  | | Russell King rmk@arm.linux.org.uk --- ---
  | | | | http://www.arm.linux.org.uk/~rmk/aboutme.html / / |
  | +-+-+ --- -+-
  / | THE developer of ARM Linux |+| /|\
 / | | | --- |
    +-+-+ ------------------------------------------------- /\\\ |

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:15 EST