Re: 2.0.36 Lockups with and without OOPS

Edward Muller (edwardam@home.com)
Mon, 05 Apr 1999 11:21:50 -0400



I think I fixed the problem.

On Friday night I went in with another Linux box and wrote a script to copy
data across the network, lots of data, over and over again (NFS/SMB copies).
The server locked up. I started looking around the system, and
/proc/interrupts showed me that the Intel 10/100 NIC and the Megaraid (the
two most used devices) were using the same interrupt.
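
The check described above (spotting two drivers listed on one line of
/proc/interrupts) can be scripted. What follows is only a minimal sketch: the
SAMPLE text is made up for illustration and is not output from this server,
the function name shared_irqs is hypothetical, and the parsing assumes that
shared devices appear comma-separated at the end of a line and that device
names contain no spaces. The exact column layout differs between kernel
versions, so treat the parsing as an assumption to verify on your own box.

```python
import re

# Illustrative sample only; NOT taken from the machine discussed here.
SAMPLE = """\
  0:    1234567   XT-PIC  timer
  1:       4021   XT-PIC  keyboard
 11:     982113   XT-PIC  eth0, megaraid
NMI:          0
"""

def shared_irqs(text):
    """Return {irq: [device, ...]} for IRQ lines that name more than one device."""
    shared = {}
    for line in text.splitlines():
        # Only numbered IRQ lines are of interest; skip NMI/ERR style rows.
        m = re.match(r"\s*(\d+):\s*(.*)", line)
        if not m:
            continue
        irq, rest = int(m.group(1)), m.group(2)
        parts = [p.strip() for p in rest.split(",") if p.strip()]
        if len(parts) < 2:
            continue  # zero or one device on this line: nothing is shared
        # The first piece still carries the counter/controller columns;
        # keep only its last whitespace-separated token (assumed device name).
        parts[0] = parts[0].split()[-1]
        shared[irq] = parts
    return shared

print(shared_irqs(SAMPLE))
```

On a live system the same function could be fed the real file, for example
shared_irqs(open("/proc/interrupts").read()). Note that it can only report
devices whose drivers have registered an interrupt handler.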

Why I didn't check something like this earlier I have no idea. Remember that
the machine was working fine for about 2 weeks before the lockups started
happening, so I ASSUMED (makes an ASS out of U and ME) that the hardware
configuration was OK and that a piece of hardware might be failing.

Rebooting and using Intel's SSU (System Setup Utility) that comes with the
N440BX motherboard showed me that for some reason three different cards
(Comtrol Rocket Port, AMI Megaraid & the integrated Intel 10/100 NIC) were all
using IRQ 11. It appears that at some point the configuration data on the
motherboard was reset and separate IRQs were not assigned to the PCI devices.
I actually had to run the Intel SSU and assign a different IRQ to each device
by hand. The motherboard did not handle it for me, even though it knew I was
not using a PnP OS.

Also note that the Comtrol Rocket Port card (a PCI card) did not and
DOES NOT appear under /proc/interrupts, using the latest stable version of
their driver (1.15, I believe). Yet the Intel SSU shows that the card is
using an interrupt. ?????

That leads me to my next question (or two): What is the state of
IRQ sharing for PCI devices under 2.0.x and 2.2.x kernels? What's different
between the two, and which is 'better'?

Alan Cox wrote:
> > This morning I got a call from them again....Screen was
> > blank...enter..enter...enter did nothing...network was down...completely
> > dead...They had to power off and power back on again......
> >
> > What's next...Any ideas...
>
> That tends to make me think hardware is the next suspect since it's a 2.0.x
> kernel.
>
> > I'm going to replace the box with a temporary unit for the time
> > being...and bring the box to my home office where I can pound on it a
> > little and NOT worry about it dying....I'm going to copy the software to
> > the new box and if I still get crashes it is definitely software (which
> > I doubt)...The new box will not have all the same hardware..but I need to
> > get something stable to my clients...
>
> If the new box works and the old one doesn't that helps to know too
>
> Alan

--
Edward Muller
Waste Not Computers & Supplies
94 Washington Ave.
Dumont, NJ 07628
(201) 384-4444 x204
(201) 384-4024 (fax)
(201) 906-4207 (cel)
edwardm@wastenotcomputers.com
edwardam@home.com

