Re: NMI errors in 2.0.30??, High Availability-Linux

Stephen Costaras (stevecs@chaven.com)
Fri, 25 Apr 1997 18:19:54 -0500 (CDT)


On Thu, 24 Apr 1997, Jon Lewis wrote:
> Uhhuh. NMI received. Dazed and confused, but trying to continue
> You probably have a hardware problem with your RAM chips or a
> power saving mode enabled.
>
> I really don't believe the message, as this is a Tomcat IIID (running with
> 2 CPU's but not an SMP kernel), 4 8x36-60 simms, and the setup passed
> several hours of memtest86 before going online. The CMOS setup is
> configured to do ECC and report single bit errors...could this cause
> problems for linux? I always disable all the power saving stuff...so I'd
> say there's at least a 99% chance it's turned off. Is it possible some
> other random kernel bug is at fault?

I am wondering about this now as well. I have just upgraded my news server
here to v2.0.30 and have started receiving several NMI messages too. I have
had ECC turned on in this machine since day 1 (Tyan S1668, w/ 128MB parity memory)
and have never seen these messages before on the system. Going back to 2.0.29,
I don't see them. They appear to happen mainly during file system checks
at bootup (although this could of course also have been a problem w/ earlier
kernels, as I don't boot the server that often, perhaps once every 1-2 months).

Has anyone else noticed anything w/ Tyan boards & ECC interacting with the kernel?
(I'm assuming it's probably some kind of reporting problem, as the posts that I've
seen are all Tyan-based.) I'm running Tyan's Award BIOS v3.03.

Stephen Costaras
stevecs@chaven.com