SCSI Disk Freeze

Ingo Oeser (ingo.oeser@informatik.tu-chemnitz.de)
Sun, 6 Jun 1999 13:31:58 +0200 (CEST)


Hi,

[1.] One line summary of the problem:

SCSI Disk freeze on NCR53c875

[2.] Full description of the problem/report:

After issuing these included messages (s. [X.]) to a remote
logging host (the localhost couldn't write them anymore) the
kernel stops handling processes.

Console switching and magic sysreq still work but seem not to be
able to Sync the affected partition.

It says sth. about "ext2fs: could not {read,write} inode for XXX" and
"ext2fs: could not read block map for XXX".

But I couldn't write them exactly down, because the repeat within
a second or two. klogd/syslogd doesn't see them unfortunatly (I
have an extra syslog host for this).

After a reboot (via magic sysreq) the BIOS says "Boot Failure",
because the SCSI adapter couldn't find the disk anymore.

After a reset, same as above.

After a powercycle (switch off, wait some seconds, switch on
again) everything works fine 'til next occourence of this
phenomeanon.

It happens around every 14 days. I have a moderate load with this
machine running apache and mysql.

[3.] Keywords (i.e., modules, networking, kernel):

kernel, SCSI, disk, hardware, ext2

[4.] Kernel version (from /proc/version):

Linux version 2.2.9 (root@nightmaster.csn.tu-chemnitz.de) (gcc version egcs-2.91.57 19980901 (egcs-1.1 release)) #4 Sun May 30 00:00:19 CEST 1999

[5.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/oops-tracing.txt)

No Ooops generated this time. But after boot. Ask again if
needed.

[6.] A small shell script or example program which triggers the
problem (if possible)

Not applicable.

[7.] Environment

RedHat 5.2 Linux with some own additions.

[7.1.] Software (add the output of the ver_linux script here)

-- Versions installed: (if some fields are empty or looks
-- unusual then possibly you have very old versions)
Linux nightmaster.csn.tu-chemnitz.de 2.2.9 #4 Sun May 30 00:00:19 CEST 1999 i586 unknown
Kernel modules 2.1.121
Gnu C egcs-2.91.57
Binutils 2.9.1.0.4
Linux C Library 2.0.7
Dynamic linker ldd (GNU libc) 2.0.7
Linux C++ Library 2.9.0
Procps 1.2.7
Mount 2.7l
Net-tools 1.50
Kbd 0.94
Sh-utils 1.16
Modules Loaded nfs lockd sunrpc parport_pc plip parport xfs-2.2.9 3c59x
opl3 sb uart401 sound soundcore

[7.2.] Processor information (from /proc/cpuinfo):

processor : 0
vendor_id : GenuineIntel
cpu family : 5
model : 2
model name : Pentium 75 - 200
stepping : 12
cpu MHz : 150.001442
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : yes
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr mce cx8
bogomips : 59.80

[7.3.] Module information (from /proc/modules):

nfs 29880 1 (autoclean)
lockd 30760 1 (autoclean) [nfs]
sunrpc 51780 1 (autoclean) [nfs lockd]
parport_pc 5560 1 (autoclean)
plip 8712 1 (autoclean)
parport 6884 1 (autoclean) [parport_pc plip]
xfs-2.2.9 37600 2
3c59x 18824 1 (autoclean)
opl3 11048 2
sb 32948 2
uart401 5904 2 [sb]
sound 56236 0 [opl3 sb uart401]
soundcore 2148 6 [sb sound]

[7.4.] SCSI information (from /proc/scsi/scsi)

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: MICROP Model: 4345WS Rev: x43h
Type: Direct-Access ANSI SCSI revision: 02

[7.5.] Other information that might be relevant to the problem
(please look in /proc and include all information that you
think to be relevant):

excerpt from syslog before syslog stopped working:

<cite src=remote-syslog>
Jun 4 23:40:21 nightmaster.csn.tu-chemnitz.de kernel: scsi : aborting command due to timeout : pid 1239279, scsi0, channel 0, id 0, lun 0 Read (10) 00 00 28
6d 00 00 00 1a 00
Jun 4 23:40:21 nightmaster.csn.tu-chemnitz.de kernel: sym53c8xx_abort: pid=1239279 serial_number=1239286 serial_number_at_timeout=1239286
</cite>

[X.] Other notes, patches, fixes, workarounds:

Please inform me, if this is a (disk-) hardware problem _for_ _sure_,
since I still own warranty time for this disk.

Many thanks

Ingo

-- 
Feel the power of the penguin - run linux@your.pc
<esc>:x

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/