Re: SMP 2.2.15pre13 unstable on Dell PE1300 - aic7xxx related?

From: Doug Ledford (dledford@redhat.com)
Date: Mon Mar 06 2000 - 15:46:32 EST


"David L. Parsley (lkml account)" wrote:
>
> Hi,
>
> We bought a Dell Poweredge 1300 w/ dual PIII-500 & 256M ram, and have had
> problems with 'sudden death' - it reboots immediately and reports an error
> along the lines of 'Memory error or NMI'. I don't see anything in
> /var/log/messages thats helpful. I tried the stock RH61 smp kernel, then
> 2.3.38ish, and finally 2.2.15pre13.
>
> I booted 2.2.15pre13smp on Saturday, and then floodpinged the machine (we
> think it's interrupt related), and it stayed steady untill today when a
> student went to make a tarball of /home. (the aic7xxx equivalent of a
> floodping?) Took the sucker right down.
>
> Instead of attaching 4 random files/logs/outputs, if anyone is interested
> in this problem I'd be happy to test whatever you like and send whatever
> outputs to whatever commands. The only interesting features of this
> machine: using Intel Etherexpress pro100b (included from Dell) and Dell's
> onboard aic7xxx w/ 2 harddrives, this machine is nfs server for /home.

The aic7xxx driver is pretty heavily tested in regards to it's SMP locking, I
doubt that's your problem. I suspect you actually have bad hardware and that
memory errors under certain types of load are taking your system down. Check
out my web page, I have the script I use to test memory in a machine on it.

-- 

Doug Ledford <dledford@redhat.com> http://people.redhat.com/dledford Please check my web site for aic7xxx updates/answers before e-mailing me about problems

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Mar 07 2000 - 21:00:20 EST