Re: Recovering from "Kernel panic: EXT2-fs panic"

Leonard N. Zubkoff (lnz@dandelion.com)
Mon, 30 Sep 1996 10:19:55 -0700


Date: Sun, 29 Sep 1996 14:11:03 -0500
From: Robert Wuest <rwuest@ix.netcom.com>

I've got this 1.3 Gig SCSI drive that's about to go out. It has to warm
up for about 30 minutes when starting cold (not kidding) and for the
most part runs fine until a storm comes along and causes the power to
drop. (Yes, I know, a new drive and/or a UPS would fix this). However,
every now and then, it produces a read error and I get this panic:

SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 28000002
Current error sd08:02: sns = f0 4
ASC=15 ASCQ= 1
Raw sense data:0xf0 0x00 0x04 0x00 0x22 0xef 0x75 0x28 0x00 0x00 0x00
0x00 0x15 0x01 0x00 0x80
scsidisk I/O error: dev 08:02, sector 1245228
Kernel panic: EXT2-fs panic (device 08:02): ext2_read_inode: unable to
read i-node block - inode=155765, block=622614

Here's the meaning of the sense key and additional info:

|------+--------------------------------------------------------------------|
| 4h   | HARDWARE ERROR. Indicates that the target detected a               |
|      | non-recoverable hardware failure (for example, controller failure, |
|      | device failure, parity error, etc.) while performing the command   |
|      | or during a self test.                                             |
|------+--------------------------------------------------------------------|

| 15h 01h DTL WRSOM MECHANICAL POSITIONING ERROR |

So it looks like your disk is quite unhappy when this occurs.
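
For reference, the sense key and additional sense code can be read straight
out of the raw sense bytes in the kernel log above. The following is a minimal
Python sketch, assuming the bytes are in the SCSI-2 fixed sense data format
(response code 70h); the name decode_fixed_sense is only illustrative.

```python
# Raw sense bytes exactly as printed in the kernel log.
RAW_SENSE = bytes([
    0xf0, 0x00, 0x04, 0x00, 0x22, 0xef, 0x75, 0x28,
    0x00, 0x00, 0x00, 0x00, 0x15, 0x01, 0x00, 0x80,
])

def decode_fixed_sense(sense):
    """Decode the main fields of fixed-format sense data (assumed layout: SCSI-2)."""
    return {
        "valid": bool(sense[0] & 0x80),          # top bit of byte 0
        "response_code": sense[0] & 0x7f,        # 70h = current error
        "sense_key": sense[2] & 0x0f,            # low nibble of byte 2
        "information": int.from_bytes(sense[3:7], "big"),
        "asc": sense[12],                        # additional sense code
        "ascq": sense[13],                       # additional sense code qualifier
    }

fields = decode_fixed_sense(RAW_SENSE)
print("sense key = %Xh, ASC = %02Xh, ASCQ = %02Xh"
      % (fields["sense_key"], fields["asc"], fields["ascq"]))
```

This yields sense key 4h with ASC 15h and ASCQ 01h, matching the "sns = f0 4"
and "ASC=15 ASCQ= 1" lines in the log.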

This is a Fujitsu M2651SA on an Adaptec 2940 controller. Kernel is
2.0.21 (although the behaviour has been the same for quite some time,
now). System is a P/100, 48 meg RAM.

Does the SCSI driver retry after an error? When I get an error accessing a
DOS partition on the same drive from within DOS, I usually don't get the
error a second time.

The SCSI driver should indeed retry this, but if the retry does not succeed
either, it signals the "scsidisk I/O error" you see.
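
The retry-then-report pattern can be sketched as follows. This is only a Python
illustration of the idea, not the kernel's code: MediumError,
read_with_retries, make_flaky_reader and the retry count of 3 are all
hypothetical.

```python
class MediumError(Exception):
    """Stands in for a failed read reported by the drive (hypothetical)."""

def read_with_retries(read_sector, sector, max_retries=3):
    """Try the read once, then retry up to max_retries times before giving up."""
    last_error = None
    for _attempt in range(1 + max_retries):
        try:
            return read_sector(sector)
        except MediumError as err:
            last_error = err          # remember the failure and try again
    # Every attempt failed: report a hard I/O error to the caller.
    raise IOError("I/O error: sector %d" % sector) from last_error

def make_flaky_reader(failures):
    """Return a fake reader that fails `failures` times, then succeeds."""
    remaining = [failures]
    def read_sector(sector):
        if remaining[0] > 0:
            remaining[0] -= 1
            raise MediumError("positioning error at sector %d" % sector)
        return b"\x00" * 512          # pretend sector contents
    return read_sector

# A transient error is hidden by the retry ...
data = read_with_retries(make_flaky_reader(1), 1245228)

# ... but a persistent one surfaces as an I/O error.
try:
    read_with_retries(make_flaky_reader(10), 1245228)
    persistent_failed = False
except IOError:
    persistent_failed = True
```

A drive that fails once and then recovers looks healthy to the caller; only an
error that outlasts every retry is passed up to the file system.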

After this happens, I can't shut down clean, not even my other drives.
It becomes a permanent error and I have to hit reset to recover my
system. Sync locks up, init fails to init, reboot just prints its
message and does nothing else. Surely this doesn't have to have such
catastrophic effects.

This is now in the realm of the EXT2 file system code, which deals rather badly
with a critical block that cannot be read.
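
Note that the block the file system gave up on is the same spot on the disk
that the SCSI layer complained about. A small Python sketch of the arithmetic,
assuming 1024-byte EXT2 blocks and 512-byte sectors (neither is stated in the
log, but the numbers agree with that assumption):

```python
SECTOR_SIZE = 512     # bytes per disk sector (assumed)
BLOCK_SIZE = 1024     # bytes per EXT2 block (assumed)

def block_to_first_sector(block):
    """Return the first sector, relative to the partition, covered by a block."""
    return block * (BLOCK_SIZE // SECTOR_SIZE)

# Block number from the EXT2 panic message.
first_sector = block_to_first_sector(622614)
print(first_sector)   # prints 1245228, the sector in the scsidisk I/O error
```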

Any advice on how to clean up the error without having to reboot?

Not really.

There's something obviously flakey about your disk. I suggest you replace it
as soon as possible, before it becomes completely unusable.

Leonard