2.2.19 + reiserfs 3.5.32 nfsd wait_on_buffer/down_failed

From: Michael Stiller (michael@ping.de)
Date: Tue May 08 2001 - 09:42:43 EST


Hi,

we run a nfs server utilizing 2.2.19 + ReiserFS version 3.5.32 on a
P 3 550 machine. Disk subsystem is a GDT7518RN using 4 UW disks as raid 5
device. After upgrading from 2.2.17 + reiserfs to 2.2.19 we experience
many (very much more than with 2.2.17) problems with our nfs clients
about 12 (linux). Network ist 100Mbit full duplex / switched.
I do not think this is network related, cause ping -f doesnt show any
packet loss.

During not so heavy IO on the exported fs
one nfsd thread seems to be waiting for the disk:

  621 root 1 0 0 0 wait_on_b DW 6.2 0.0 1:49 nfsd

and the other threads are waiting in down_fail:

  610 root 0 0 0 0 down_fail DW 0.0 0.0 1:52 nfsd
  611 root 0 0 0 0 down_fail DW 0.0 0.0 1:40 nfsd
  612 root 0 0 0 0 down_fail DW 0.0 0.0 1:41 nfsd
  613 root 0 0 0 0 down_fail DW 0.0 0.0 1:48 nfsd
  614 root 0 0 0 0 down_fail DW 0.0 0.0 1:45 nfsd
  615 root 0 0 0 0 down_fail DW 0.0 0.0 1:43 nfsd
  616 root 0 0 0 0 down_fail DW 0.0 0.0 1:50 nfsd
  617 root 0 0 0 0 down_fail DW 0.0 0.0 1:42 nfsd
  618 root 0 0 0 0 down_fail DW 0.0 0.0 1:44 nfsd
  619 root 0 0 0 0 down_fail DW 0.0 0.0 1:42 nfsd
  620 root 0 0 0 0 down_fail DW 0.0 0.0 1:47 nfsd
  622 root 0 0 0 0 down_fail DW 0.0 0.0 1:47 nfsd
  623 root 0 0 0 0 down_fail DW 0.0 0.0 1:43 nfsd
  624 root 0 0 0 0 down_fail DW 0.0 0.0 1:48 nfsd
  609 root 0 0 0 0 down_fail DW 0.0 0.0 1:50 nfsd

During this event:

- If i check the disk io with e.g. vmstat 1 the machine is doing about 200 bi
  per second, which is not so much i guess.
- the client machines hang, should be clear:

nfs: server foo is not responding
nfs: server foo still not responding
nfs: server foo OK

Our idea is to revert back to 2.2.17 cause the behaviour was much better.

How can i debug this ? Can i do some tuning ?
Should i revert to some older kernel.
Are there any patches for this problem ? Does anyone has the same or
related problem ?

Any pointer would be useful.

TIA and cheers,

-Michael

-- 
In a world where an admin is rendered useless when the ball in his mouse
has been taken out, its good to know that I know UNIX. 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue May 15 2001 - 21:00:13 EST