From: Kerstin Jonsson <kerstin.jonsson@xxxxxxxxxxxx>
When the SMP kernel decides to crash_kexec() the local APICs may have
pending interrupts in their vector tables.
The setup routine for the local APIC has a deficient mechanism for
clearing these interrupts: it only safely handles interrupts that have
already been dispatched to the local core for servicing (the ISR
register). It does not consider lower-priority queued interrupts stored
in the IRR register.
If you have more than one pending interrupt within the same 32-bit word
in the LAPIC vector table registers, you may find yourself entering the
IO APIC setup with pending interrupts left in the LAPIC. This is a
situation for which the IO APIC setup is not prepared. Depending on
which interrupt vectors are stuck in the APIC tables, your
system may show various degrees of malfunction.
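
The failure mode described above can be reproduced with a small user-space
toy model. This is only a sketch under loud assumptions: struct toy_lapic,
toy_eoi() and the other helpers are hypothetical names, not kernel code, and
the model assumes that an EOI immediately lets the highest queued IRR vector
move into the ISR. Real hardware applies more detailed priority rules.

```c
#include <assert.h>
#include <stdint.h>

#define NR_WORDS 8 /* 8 x 32 bits = 256 vectors per register bank */

/* Toy model only; not the real LAPIC register layout. */
struct toy_lapic {
    uint32_t isr[NR_WORDS]; /* vectors currently in service */
    uint32_t irr[NR_WORDS]; /* vectors requested but still queued */
};

/* Highest set vector in a bank, or -1 if the bank is empty. */
static int highest_vector(const uint32_t *bank)
{
    for (int i = NR_WORDS - 1; i >= 0; i--)
        for (int j = 31; j >= 0; j--)
            if (bank[i] & (1u << j))
                return i * 32 + j;
    return -1;
}

static void set_vector(uint32_t *bank, int vec)
{
    assert(vec >= 0 && vec < NR_WORDS * 32);
    bank[vec / 32] |= 1u << (vec % 32);
}

static void clear_vector(uint32_t *bank, int vec)
{
    assert(vec >= 0 && vec < NR_WORDS * 32);
    bank[vec / 32] &= ~(1u << (vec % 32));
}

/*
 * EOI: retire the highest in-service vector. ASSUMPTION of this model:
 * the highest queued vector is then promoted from IRR to ISR at once,
 * provided it outranks whatever is still in service.
 */
static void toy_eoi(struct toy_lapic *l)
{
    int v = highest_vector(l->isr);

    if (v >= 0)
        clear_vector(l->isr, v);
    v = highest_vector(l->irr);
    if (v >= 0 && v > highest_vector(l->isr)) {
        clear_vector(l->irr, v);
        set_vector(l->isr, v);
    }
}

/* Single pass over ISR snapshots: a vector promoted into a word after
 * that word was read is never acknowledged. */
static int ack_isr_once(struct toy_lapic *l)
{
    int acked = 0;

    for (int i = NR_WORDS - 1; i >= 0; i--) {
        uint32_t value = l->isr[i]; /* snapshot of this word */

        for (int j = 31; j >= 0; j--)
            if (value & (1u << j)) {
                toy_eoi(l);
                acked++;
            }
    }
    return acked;
}

/* Repeat the ISR pass for as long as the IRR showed queued vectors. */
static int ack_until_drained(struct toy_lapic *l, int max_loops)
{
    int acked = 0;
    uint32_t queued;

    do {
        queued = 0;
        for (int i = 0; i < NR_WORDS; i++)
            queued |= l->irr[i];
        acked += ack_isr_once(l);
    } while (queued && --max_loops > 0);
    return acked;
}

/* True when neither bank holds any vector. */
static int is_clean(const struct toy_lapic *l)
{
    return highest_vector(l->isr) < 0 && highest_vector(l->irr) < 0;
}
```

With vector 0x31 in service and vectors 0x30 and 0x29 queued (all three in
the same 32-bit word), ack_isr_once() leaves the model dirty, while
ack_until_drained() empties both banks.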
That was the reason why check_timer() failed on our system: the
timer interrupts were blocked by pending interrupts from the old kernel
when routed through the IO APIC.
Additional comment from Jiri Bohac:
==============
If this should go into a stable release,
I'd add some kind of limit on the number of iterations, just to be safe from
hard-to-debug lock-ups:
+if (loops++ > MAX_LOOPS) {
+ printk("LAPIC pending clean-up");
+ break;
+}
while (queued);
With MAX_LOOPS something like 1E9, this would leave plenty of time for the
pending IRQs to be cleared and would still cause at most a second of delay
if the loop were to lock up for whatever reason.
==============
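
The iteration cap suggested in the comment above could look roughly like the
following user-space sketch. drain_bounded(), pending_fn and the two helper
callbacks are hypothetical names introduced only for illustration; in the
kernel, the "still queued" check would be the IRR read and the message would
go through printk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_LOOPS 1000000000UL /* the 1E9 suggested in the comment */

/* Hypothetical stand-in for "are interrupts still queued?". */
typedef bool (*pending_fn)(void *ctx);

/*
 * Poll until nothing is queued, but give up once more than max_loops
 * polls have reported pending work. Returns true if the queue drained,
 * false if the cap was hit. *loops_used counts the polls that saw
 * pending work.
 */
static bool drain_bounded(pending_fn still_queued, void *ctx,
                          unsigned long max_loops,
                          unsigned long *loops_used)
{
    unsigned long loops = 0;
    bool queued;

    assert(still_queued != NULL && loops_used != NULL);
    do {
        queued = still_queued(ctx);
        if (queued && ++loops > max_loops) {
            fprintf(stderr, "LAPIC pending clean-up: giving up\n");
            break;
        }
    } while (queued);
    *loops_used = loops;
    return !queued;
}

/* Test helper: reports pending work *ctx more times, then is empty. */
static bool countdown_pending(void *ctx)
{
    int *left = ctx;

    if (*left > 0) {
        (*left)--;
        return true;
    }
    return false;
}

/* Test helper: models a stuck interrupt that never clears. */
static bool always_pending(void *ctx)
{
    (void)ctx;
    return true;
}
```

A caller would pass MAX_LOOPS as the cap. A pure iteration count says nothing
about wall-clock time when each poll is slow, which is the concern the V2
notes raise.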
From trenn@xxxxxxx:
V2: Use the TSC, if available, to bail out after 1 sec, because apic_read
calls may be virtualized and take rather long (suggested by: Avi Kivity<avi@xxxxxxxxxx>).
If no TSC is available, bail out quickly after cpu_khz iterations. If we broke
out too early and still have IRQs pending (which should never happen?), we still
get a WARN_ON...
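
The time budget in the V2 notes comes down to simple arithmetic: cpu_khz is
in thousands of cycles per second, so cpu_khz << 10 is cpu_khz * 1024 cycles,
or roughly 1.024 seconds. The sketch below shows that calculation in user
space. tsc_budget_left() and loop_budget_left() are hypothetical helper names,
and the signed return type is an assumption made here so that overshooting
the deadline shows up as a negative value instead of wrapping around.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cycles left before the roughly one second deadline.
 * cpu_khz   : CPU frequency in kHz (thousands of cycles per second)
 * start_tsc : cycle counter value read before the loop
 * now_tsc   : cycle counter value read in the current iteration
 * A result <= 0 means the budget is used up.
 */
static int64_t tsc_budget_left(uint32_t cpu_khz, uint64_t start_tsc,
                               uint64_t now_tsc)
{
    assert(now_tsc >= start_tsc);
    int64_t budget = (int64_t)cpu_khz << 10; /* cpu_khz * 1024 cycles */

    return budget - (int64_t)(now_tsc - start_tsc);
}

/* Fallback without a cycle counter: count plain loop iterations down. */
static int64_t loop_budget_left(int64_t max_loops)
{
    return max_loops - 1;
}
```

For a 2 GHz CPU (cpu_khz = 2000000) the budget is 2048000000 cycles. A loop
of the form "while (queued && budget > 0)" then stops either when nothing is
queued any more or when the budget reaches zero or goes negative.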
@@ -1151,8 +1152,12 @@ static void __cpuinit lapic_setup_esr(void)
*/
void __cpuinit setup_local_APIC(void)
{
- unsigned int value;
- int i, j;
+ unsigned int value, queued;
+ int i, j, acked = 0;
+ unsigned long long tsc = 0, ntsc;
+ long long max_loops = cpu_khz;
+
+ if (cpu_has_tsc)
+ rdtscll(tsc);
+ if (cpu_has_tsc) {
+ rdtscll(ntsc);
+ max_loops = (cpu_khz<< 10) - (ntsc - tsc);
+ } else
+ max_loops--;
+ } while (queued&& max_loops> 0);
+ WARN_ON(max_loops <= 0);