Re: nmi_watchdog fix for x86_64 to be more like i386

From: Andi Kleen
Date: Tue Oct 02 2007 - 01:51:57 EST



> Agreed.
>
> I just got a x8664-hrt report, where I found the following oddity:
>
> 0: 1197 172881 IO-APIC-edge timer
>
> That's one of those infamous AMD C1E boxen. Strange, all my systems have
> IRQ#0 on CPU#0 and nowhere else. Any idea ?

Hmm, in lowestpriority mode it would be possible that the APIC changes
the CPU to #1 once; but IRQ 0 is always set to fixed mode. Also even
if that happens you should have them all on 1.

Maybe the chipset is just ignoring the IO-APIC configuration in this case?

Is it always the same chipset? Is it seen on i386 too?

The problem is really that if this happens it's more than the NMI watchdog
that is broken. If you don't run an additional APIC timer interrupt on CPU #0
it's possible that CPU #0 won't schedule at all.

The only workaround for chipsets ignoring IRQ affinity would be to keep
track on which CPU irq 0 happens and then restart APIC timer interrupts
on the others (or send IPIs) as needed. But that would be fairly ugly.

IRQ 0 is unfortunately an generally under-validated area because Windows doesn't
use it.

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/