Re: Ext3 Errors on Dell RAID
From: Matt Domsch
Date: Tue Aug 23 2005 - 08:28:29 EST
On Tue, Aug 23, 2005 at 09:05:27AM -0400, Jess Balint wrote:
> Problem:
> I get massive ext3 errors once every few days. See "errors on console"
> section below. Almost all commands return I/O error. I have to power
> cycle the machine to get it running again. Upon reboot, there are
> usually 3 orphan inodes deleted and everything is fine. See "messages
> on reboot" below.
>
> Configuration:
> System: Dell PowerEdge 6300/500, 4 CPU SMP w/2GB memory
> Discs: 3 SCSI discs in a controller-managed striped configuration
> Controller: Dell PERC-2
> kernel messages in "kernel boot messages" below
This looks very familiar and, given the firmware versions you mention,
is probably a known issue. The controller firmware starts a cache
flush that does not complete in a reasonable amount of time, and
eventually the SCSI midlayer starts aborting commands and takes the
file system offline.
I don't believe a firmware update was released for your add-in PERC2
quad-channel card. Firmware 6091, which addresses this exact case, was
released for the PERC3/Di ROMBs, though other failures have been
reported on linux-poweredge@xxxxxxxx (subscribe and read archives at
http://lists.us.dell.com) even with newer firmware.
The workarounds include:
1) disable the read and write cache using afacli.
2) mount file systems using 'noatime'.
3) back up your data, replace the controller with something newer
(disks on the onboard aic7xxx controller combined with Linux Software
RAID works quite well), recreate your RAID array on the new
controller, and restore your data from backups.
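
For workaround 2, a rough sketch of adding 'noatime' to the ext3
entries in /etc/fstab. This is only an illustration: the helper name
add_noatime and the device name in the usage note are made up, and it
assumes the usual whitespace-separated fstab layout (device, mount
point, type, options, dump, pass). Check the result by hand before
replacing a real fstab.

```python
def add_noatime(fstab_line, fstypes=("ext3",)):
    """Return fstab_line with 'noatime' added to its options field.

    Hypothetical helper for illustration. Lines that are comments,
    blank, malformed, or for other file system types are returned
    unchanged.
    """
    stripped = fstab_line.strip()
    # Leave comments and blank lines untouched.
    if not stripped or stripped.startswith("#"):
        return fstab_line
    fields = stripped.split()
    # Need at least: device, mount point, type, options.
    if len(fields) < 4 or fields[2] not in fstypes:
        return fstab_line
    options = fields[3].split(",")
    if "noatime" not in options:
        options.append("noatime")
    fields[3] = ",".join(options)
    # Re-join with tabs; the original column alignment is not preserved.
    return "\t".join(fields)
```

Calling add_noatime("/dev/sda1 / ext3 defaults 1 1") should give the
same entry with "defaults,noatime" in the options field. An already
mounted file system can pick up the option without a reboot via
"mount -o remount,noatime <mountpoint>".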
Thanks,
Matt
--
Matt Domsch
Software Architect
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com