Re: 2.4.22pre8 hangs too (Re: 2.4.21-jam1 solid hangs)

From: Ville Herva
Date: Tue Sep 09 2003 - 02:07:58 EST


On Fri, Aug 29, 2003 at 10:57:37PM +0300, you [Ville Herva] wrote:
> On Fri, Aug 29, 2003 at 01:35:25PM -0300, you [Marcelo Tosatti] wrote:
> >
> > So NMI and sysrq don't help. I suggest a few things:
> >
> > Try to make the bug easy to reproduce. Force the Oracle dumps again and
> > again to crash the box.
>
> I happened to work towards that direction this morning (before I read your
> mail). Taking the stance that this very probably had something to do with io
> stress, I played around with several io loads. Eventually I found out that
> fsx on scsi disk reliably caused the box to either lock up or the aic7xxx
> driver to barf. What's more, it took under 15 minutes to trigger.
>
> So I copied the rootfs and everything else from the scsi disk to the ide
> disk (just barely had enough space), and took all the scsi disk partitions
> away from fstab. After reboot, I have been unable to lock it up with fsx
> (scsi disk is not accessed at all), but it will take several weeks before
> I'm confident that the lock up is cured.

And indeed it did lock up, even though the scsi disk is not used at all. It
just took weeks.

At the time no heavy IO was going on afaict (though there might have been
some IO).

I'm completely out of ideas here. What the heck is the culprit...? Perhaps a
faulty motherboard?

> The hw is:
> Intel 815EEA2LU (i815 Chipset)
> Celeron 1.3GHz (Tualatin)
> Adaptec AHA-2940 / AIC-7871 (NOT USED)
> - Disk SEAGATE Model: ST19171W Rev: 0024 (NOT USED)
> - Tape Drive HP Model: C1537A Rev: L708
> 30GB IDE disk (All fs's here at the moment)



-- v --

v@xxxxxx