Fix or Flaw ?? WAS (Possible cause of "spurious APIC interrupt")

Andre M. Hedrick (hedrick@Astro.Dyer.Vanderbilt.Edu)
Sun, 24 May 1998 00:32:15 -0500 (CDT)


Brian Perkins, Ingo Molnar, Linus, et al.
Hey guys, I think I found the nasty booger.

Follow along as best you can.
(I lost my focus about halfway through this message;
damn A.D.D., and my Ritalin is at work.)

On Sat, 23 May 1998, Chris Pirih wrote:

> At 04:44 PM 05/23/1998 -0700, Vadim E. Kogan wrote:
> >In my case (same MB, 2 PPro233) I *don't* have USB enabled (at least
> >it's not in /proc/pci) and I *do* get spurious APIC interrupts. (see my
> >message several hours ago). Let's check what other common hardware we
> >have - maybe we'll be able to find the problem.

ENABLING IO-APIC IRQs
init IO_APIC IRQs
IO-APIC pin 0, 9, 10, 11, 15, 20, 21, 22, 23 not connected.
..MP-BIOS bug: 8254 timer not connected to IO-APIC
..trying to set up timer as ExtINT ... .. (found pin 0) ... works.
nr of MP irq sources: 18.
nr of IO-APIC registers: 24.
testing the IO APIC.......................
.... register #00: 02000000
....... : physical APIC id: 02
.... register #01: 00170011
....... : max redirection entries: 0017
....... : IO APIC version: 0011
.... register #02: 00000000
....... : arbitration: 00
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 0FF 0F  0    0    0   0   0    1    7    51
01 0FF 0F  0    0    0   0   0    1    1    59
02 0FF 0F  0    0    0   0   0    1    1    51
03 0FF 0F  0    0    0   0   0    1    1    69
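
For reference, the three register values in the dump can be decoded by hand. The sketch below assumes the field positions of the 82093AA IO-APIC register layout (ID and arbitration ID in bits 24-27, version in bits 0-7, last redirection index in bits 16-23). The helper names are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Register #00: IO-APIC ID in bits 24-27. */
static unsigned ioapic_id(uint32_t reg0)
{
	return (reg0 >> 24) & 0x0f;
}

/* Register #01: index of the last redirection entry in bits 16-23. */
static unsigned ioapic_max_redir(uint32_t reg1)
{
	return (reg1 >> 16) & 0xff;
}

/* Register #01: version in bits 0-7. */
static unsigned ioapic_version(uint32_t reg1)
{
	return reg1 & 0xff;
}

/* Register #02: arbitration ID in bits 24-27. */
static unsigned ioapic_arbitration(uint32_t reg2)
{
	return (reg2 >> 24) & 0x0f;
}

/* Check the helpers against the values printed in the boot log. */
static void check_against_boot_log(void)
{
	assert(ioapic_id(0x02000000u) == 0x02);
	assert(ioapic_max_redir(0x00170011u) == 0x17);
	assert(ioapic_version(0x00170011u) == 0x11);
	/* Last index 0x17 means 24 entries, matching "nr of IO-APIC registers: 24". */
	assert(ioapic_max_redir(0x00170011u) + 1 == 24);
}
```

Run against the later dumps, register #02 reads 04000000, which this decoding turns into arbitration ID 04, as printed.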

./linux/arch/i386/kernel/io_apic.c

__initfunc(void setup_ExtINT_pin (unsigned int pin))
{
	struct IO_APIC_route_entry entry;

	/*
	 * add it to the IO-APIC irq-routing table:
	 */
	memset(&entry,0,sizeof(entry));

	entry.delivery_mode = dest_ExtINT;
	entry.dest_mode = 1;				/* logical delivery */
	entry.mask = 0;					/* unmask IRQ now */
/*	entry.dest.logical.logical_dest = 0xff; */	/* all CPUs */
	entry.dest.logical.logical_dest = 0x0f;		/* Local CPU */

	entry.vector = IO_APIC_VECTOR(pin);		/* it's ignored */

	entry.polarity=0;
	entry.trigger=0;

	io_apic_write(0x10+2*pin, *(((int *)&entry)+0));
	io_apic_write(0x11+2*pin, *(((int *)&entry)+1));
}
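
To make the two io_apic_write() calls easier to follow, here is a minimal sketch of how a 64-bit redirection entry splits into its low and high 32-bit words. It assumes the 82093AA bit layout (vector in bits 0-7, delivery mode 8-10, destination mode 11, polarity 13, trigger 15, mask 16, destination in the top byte of the high word) and that the Vect column in the dump is hexadecimal. The function names are hypothetical and the real code uses a bitfield struct instead of shifts.

```c
#include <assert.h>
#include <stdint.h>

#define DELIVERY_EXTINT 7u	/* matches the value 7 in the "Deli" column */

/* Low 32 bits of a redirection entry (assumed 82093AA layout). */
static uint32_t route_entry_low(unsigned vector, unsigned delivery_mode,
				unsigned dest_mode, unsigned polarity,
				unsigned trigger, unsigned mask)
{
	return (uint32_t)(vector & 0xffu)
	     | ((uint32_t)(delivery_mode & 0x7u) << 8)
	     | ((uint32_t)(dest_mode & 0x1u) << 11)
	     | ((uint32_t)(polarity & 0x1u) << 13)
	     | ((uint32_t)(trigger & 0x1u) << 15)
	     | ((uint32_t)(mask & 0x1u) << 16);
}

/* High 32 bits: the 8-bit destination sits in bits 24-31. */
static uint32_t route_entry_high(unsigned dest)
{
	return (uint32_t)(dest & 0xffu) << 24;
}

/* Register index of the low word for a pin; the high word is at +1. */
static unsigned route_entry_reg(unsigned pin)
{
	assert(pin < 24);	/* 24 entries reported in the boot log */
	return 0x10 + 2 * pin;
}
```

With the values from the pin 0 row (vector 0x51, ExtINT, logical mode, unmasked) the low word comes out as 0x00000F51, and a destination of 0x0f gives a high word of 0x0F000000.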

./linux/arch/i386/kernel/irq.c

__initfunc(void init_IRQ(void))
{
	int i;

	/* set the clock to 100 Hz */
	outb_p(0x34,0x43);		/* binary, mode 2, LSB/MSB, ch 0 */
	outb_p(LATCH & 0xff , 0x40);	/* LSB */
	outb(LATCH >> 8 , 0x40);	/* MSB */

	for (i=0; i<NR_IRQS; i++) {
		irq_desc[i].events = 0;
		irq_desc[i].status = 0;
	}
	/*
	 * 16 old-style INTA-cycle interrupt gates:
	 */
	for (i = 0; i < 16; i++)
		set_intr_gate(0x20+i,interrupt[i]);

#ifdef __SMP__

	/* BLAH BLAH */

	/* IPI vector for APIC spurious interrupts */
	set_intr_gate(0xff, spurious_interrupt);
#endif
	request_region(0x20,0x20,"pic1");
	request_region(0xa0,0x20,"pic2");
	setup_x86_irq(2, &irq2);
	setup_x86_irq(13, &irq13);
}
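
As a side note on the three outb lines at the top of init_IRQ(), the sketch below works out the numbers they send to the 8254. It assumes the usual PC timer input clock of 1193180 Hz and a LATCH defined as that rate divided by HZ, rounded to nearest. PIT_TICK_RATE and the helper names are illustrative, not the kernel's own identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_TICK_RATE 1193180u	/* assumed 8254 input clock in Hz */

/* Reload value for the requested tick frequency, rounded to nearest. */
static unsigned pit_latch(unsigned hz)
{
	assert(hz > 0);
	return (PIT_TICK_RATE + hz / 2) / hz;
}

/* Low byte, written first to port 0x40. */
static uint8_t pit_latch_lsb(unsigned latch)
{
	return (uint8_t)(latch & 0xff);
}

/* High byte, written second to port 0x40. */
static uint8_t pit_latch_msb(unsigned latch)
{
	return (uint8_t)((latch >> 8) & 0xff);
}

/*
 * 8254 control word: channel in bits 6-7, access mode in bits 4-5
 * (3 = low byte then high byte), operating mode in bits 1-3, BCD in bit 0.
 */
static uint8_t pit_control_word(unsigned channel, unsigned access,
				unsigned mode, unsigned bcd)
{
	return (uint8_t)(((channel & 0x3u) << 6) | ((access & 0x3u) << 4) |
			 ((mode & 0x7u) << 1) | (bcd & 0x1u));
}
```

For 100 Hz this gives a reload value of 11932 (0x2E9C), so 0x9C goes out as the LSB and 0x2E as the MSB, and channel 0 with LSB/MSB access in mode 2 and binary counting yields the 0x34 written to port 0x43.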

NOW (with logical_dest = 0x0f)

           CPU0       CPU1
  0:     197407          0          XT-PIC  timer
  1:       3481       4412    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  3:       3301       5823    IO-APIC-edge  serial
  4:       6125      11781    IO-APIC-edge  serial
  5:          0          2    IO-APIC-edge  soundblaster
  8:          1          0    IO-APIC-edge  rtc
 12:       5949       6036    IO-APIC-edge  PS/2 Mouse
 13:          1          0          XT-PIC  fpu
 16:          9          8   IO-APIC-level  ide2, ide3
 18:       2059       2063   IO-APIC-level  ide0
 19:          1          2   IO-APIC-level  Digital DS21041 Tulip
NMI:          0
IPI:          0

THEN (with logical_dest = 0xff)

           CPU0       CPU1
  0:     135388        179          XT-PIC  timer
  1:       3054       1522    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  3:       6625       2706    IO-APIC-edge  serial
  4:       7136       5115    IO-APIC-edge  serial
  5:          0          2    IO-APIC-edge  soundblaster
  8:          0          1    IO-APIC-edge  rtc
 12:      12633       7665    IO-APIC-edge  PS/2 Mouse
 13:          1          0          XT-PIC  fpu
 16:          9         10   IO-APIC-level  ide2, ide3
 18:       1687       1689   IO-APIC-level  ide0
 19:          2          1   IO-APIC-level  Digital DS21041 Tulip

  0:     135388        179          XT-PIC  timer
                       ^^^ Are these timer interrupts on CPU1 the cause of
"spurious APIC interrupt, ayiee, should never happen."?

Should we not change the logical destination mask from all CPUs to Local?

/* entry.dest.logical.logical_dest = 0xff; */ /* all CPUs */
entry.dest.logical.logical_dest = 0x0f; /* Local CPU */
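
To spell out what such a mask can select, here is a rough sketch of logical destination matching in the APIC flat model, where a CPU accepts an interrupt if the destination mask and its own logical ID share at least one bit. The one-bit-per-CPU ID assignment below is an ASSUMPTION for illustration only. How the kernel actually programs each CPU's logical ID is not shown in this message, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMPTION: each CPU owns exactly one bit of the 8-bit logical ID.
 * This is not taken from the kernel source quoted in this message.
 */
static uint8_t assumed_logical_id(unsigned cpu)
{
	assert(cpu < 8);
	return (uint8_t)(1u << cpu);
}

/* Flat model: the CPU is a target if mask and logical ID overlap. */
static int logical_dest_matches(uint8_t dest_mask, uint8_t logical_id)
{
	return (dest_mask & logical_id) != 0;
}
```

Under that assumption a mask only selects the CPUs whose ID bit it contains, so whether 0x0f really means "Local CPU" depends entirely on how the logical IDs were set up.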

ENABLING IO-APIC IRQs
init IO_APIC IRQs
IO-APIC pin 0, 9, 10, 11, 15, 20, 21, 22, 23 not connected.
..MP-BIOS bug: 8254 timer not connected to IO-APIC
..trying to set up timer as ExtINT ... .. (found pin 0) ...
..make_8259A_irq_now ... works.
nr of MP irq sources: 18.
nr of IO-APIC registers: 24.
testing the IO APIC.......................
.... register #00: 02000000
....... : physical APIC id: 02
.... register #01: 00170011
....... : max redirection entries: 0017
....... : IO APIC version: 0011
.... register #02: 04000000
....... : arbitration: 04
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 00F 0F  0    0    0   0   0    1    7    51
01 0FF 0F  0    0    0   0   0    1    1    59
02 0FF 0F  0    0    0   0   0    1    1    51
03 0FF 0F  0    0    0   0   0    1    1    69

With the 0xff flag set for all CPUs, we then force this

	if (pin2 != -1) {
		printk(".. (found pin %d) ...", pin2);
		setup_ExtINT_pin (pin2);
		make_8259A_irq(0);
	}

from "__initfunc(static void check_timer (void))". As you can see above,
the timer IRQ is set up as XT-PIC. So why do we set the destination mask
for all CPUs?

Now the "FLAW" question..............

/* entry.dest.logical.logical_dest = 0xff; */ /* all CPUs */
/* entry.dest.logical.logical_dest = 0x0f; */ /* Local CPU */
entry.dest.logical.logical_dest = 0xf0; /* non-Local CPU */

ENABLING IO-APIC IRQs
init IO_APIC IRQs
IO-APIC pin 0, 9, 10, 11, 15, 20, 21, 22, 23 not connected.
..MP-BIOS bug: 8254 timer not connected to IO-APIC
..trying to set up timer as ExtINT ... .. (found pin 0) ... works.
nr of MP irq sources: 18.
nr of IO-APIC registers: 24.
testing the IO APIC.......................
.... register #00: 02000000
....... : physical APIC id: 02
.... register #01: 00170011
....... : max redirection entries: 0017
....... : IO APIC version: 0011
.... register #02: 04000000
....... : arbitration: 04
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 0F0 00  0    0    0   0   0    1    7    51
01 0FF 0F  0    0    0   0   0    1    1    59
02 0FF 0F  0    0    0   0   0    1    1    51
03 0FF 0F  0    0    0   0   0    1    1    69
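
One detail worth noting in the three dumps is how the Log and Phy columns move together for entry 00: 0FF/0F, 00F/0F and 0F0/00. A small sketch of what that suggests follows, under the ASSUMPTION that both columns print the same destination byte and that the physical view is just its low 4 bits (an overlay, like a union). The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Logical view: the full 8-bit destination byte. */
static unsigned dest_logical_view(uint8_t dest)
{
	return dest;
}

/* ASSUMPTION: the physical view is the low 4 bits of the same byte. */
static unsigned dest_physical_view(uint8_t dest)
{
	return dest & 0x0f;
}

/* Compare against the Log/Phy pairs printed for entry 00 in the dumps. */
static void check_against_dumps(void)
{
	assert(dest_logical_view(0xff) == 0xff && dest_physical_view(0xff) == 0x0f);
	assert(dest_logical_view(0x0f) == 0x0f && dest_physical_view(0x0f) == 0x0f);
	assert(dest_logical_view(0xf0) == 0xf0 && dest_physical_view(0xf0) == 0x00);
}
```

If that reading is right, the Phy column carries no independent information when the entry is in logical mode.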

I tried this as well and got the same darn results. What am I missing?
Is the goal to flag all CPUs because the unconnected timer really does
generate an interrupt at IRQ0 too, in the "MP-BIOS bug: 8254 timer not
connected to IO-APIC" case?

If so, what next?

Back to the USB IRQ255...........

./linux/arch/i386/kernel/bios32.c

__initfunc(void pcibios_fixup_devices(void))
{

	/* BLAH BLAH */

#endif
		/*
		 * Fix out-of-range IRQ numbers and report bogus IRQ.
		 */
		if (dev->irq >= NR_IRQS)
			dev->irq = 0;
	}
}

If the device truly reports 255 (0xFF) as its interrupt value under SMP,
should we not let it live instead of resetting it to 0?
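
To make the question concrete, here is a sketch of the two behaviours side by side: the existing clamp, and a variant that leaves 255 alone. It is only an illustration of the idea, not a proposed patch. IRQ_NOT_ROUTED, the function names and the nr_irqs value used in the example are all hypothetical, and treating 255 as "no interrupt line assigned" is an assumption about what the device means by it.

```c
#include <assert.h>

#define IRQ_NOT_ROUTED 255u	/* hypothetical name for the 0xFF value */

/* Existing behaviour: any out-of-range IRQ number becomes 0. */
static unsigned fixup_irq_clamp(unsigned irq, unsigned nr_irqs)
{
	assert(nr_irqs > 0);
	return irq >= nr_irqs ? 0 : irq;
}

/* Variant: keep 255 as a distinct marker, clamp everything else. */
static unsigned fixup_irq_keep_255(unsigned irq, unsigned nr_irqs)
{
	assert(nr_irqs > 0 && nr_irqs <= IRQ_NOT_ROUTED);
	if (irq == IRQ_NOT_ROUTED)
		return irq;
	return irq >= nr_irqs ? 0 : irq;
}
```

With an example limit of 64, the first function maps 255 to 0, which makes it look like the timer line, while the second returns 255 unchanged and still resets other bogus values such as 200 to 0.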

The questions will pour again soon........

Cheers from the Confused,
Andre
