Re: SMP "Aiee" in 2.0.24

Leonard N. Zubkoff (lnz@dandelion.com)
Thu, 31 Oct 1996 12:49:43 -0800


I'm responding to both of these together...

Date: Mon, 28 Oct 1996 09:19:01 -0700
From: Bill Reynolds <bill@caulfield.rt66.com>

Got this under an SMP box while running a torture test: a huge (2G
virtual mem) job beating on the machine, with another machine hitting
it with a 'ping -f -s 4' on an isolated 10B/T network. After two
days of running, we got the message. Note the box is still running,
and doesn't seem to be causing any trouble (we did have to resuscitate
the interface by manually bringing it up and down).

Oct 23 10:07:01 jack kernel: reserved: 0000

Date: Thu, 31 Oct 1996 14:51:32 -0500 (EST)
From: Minjui Huang <huang@cycds5.nscl.msu.edu>

My dual pentium pro 200 (Tyan 440FX motherboard) has the following
problem after one day's uptime with SMP enabled in 2.0.24.
(It also happens in 2.0.22, 2.0.23, didn't try lower versions)
The machine still works, but it "crawls"!
In particular, I can't shut down the machine with: # shutdown -r now
The machine will stop at some unmounting process and I need to reboot it.
Without SMP enabled in 2.0.23, I still saw "Aiee" sometimes,
but the performance of the machine was not degraded.

Oct 31 08:48:54 www kernel: reserved: 0000

Most likely you are seeing a bug in the P6 CPU Local APIC that causes spurious
interrupts to be incorrectly delivered as exception 15. I first discovered
this problem back in July without understanding why it was happening. Later,
someone reported that Intel had described it in the July P6 CPU Errata. Linus
installed a patch in 2.0.11 that merely prints a message when this happens.
Unfortunately, one person reported a problem with the patch, so Linus removed
it in 2.0.22 despite my recommendation not to (see below for the original
discussion). If you restore the patch from 2.0.11, you will see occasional
messages when the trap happens, but there shouldn't be any ill effects.
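
For illustration, here is a minimal user-space sketch of the "print a message
and carry on" approach described above. It is not the 2.0.11 patch itself; the
names handle_trap and spurious_traps_seen, the return codes, and the message
text are all hypothetical, and a real handler would live in the kernel's trap
code rather than in an ordinary function.

```c
#include <stdio.h>

/* Hypothetical model only; this is NOT the code from the 2.0.11 patch. */
#define SPURIOUS_TRAP_VECTOR 15   /* the exception number named in this post */

static unsigned long spurious_trap_count = 0;

/*
 * Model of a trap dispatcher.
 * Returns 0 when the trap is logged and ignored, -1 when the caller
 * should treat it as a genuine fault.
 */
int handle_trap(int vector)
{
    if (vector == SPURIOUS_TRAP_VECTOR) {
        /* Count it, report it, and let execution continue. */
        spurious_trap_count++;
        fprintf(stderr, "ignoring spurious trap %d (seen %lu so far)\n",
                vector, spurious_trap_count);
        return 0;
    }
    return -1;   /* anything else is left to the normal fault path */
}

/* Number of spurious traps logged so far. */
unsigned long spurious_traps_seen(void)
{
    return spurious_trap_count;
}
```

Calling handle_trap(15) in this model logs one line and returns 0, while any
other vector returns -1 and leaves the counter untouched.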

Or perhaps I should prepare a more precise patch now that we understand the
problem better.

Leonard