Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

From: Vivek Goyal
Date: Wed Feb 06 2008 - 18:52:23 EST


On Thu, Feb 07, 2008 at 12:36:57AM +0100, Ingo Molnar wrote:
>
> * H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>
> >> I am wondering if interrupts are disabled on crashing cpu or if
> >> crashing cpu is inside die_nmi(), how would it stop/prevent delivery
> >> of NMI IPI to other cpus.
> >
> > I don't see how it would.
>
> cross-CPU IPIs are a bit fragile on some PC platforms. So if the kexec
> code relies on getting IPIs to all other CPUs, it might not be able to
> do it reliably. There might be limitations on how many APIC irqs there
> can be queued at a time, and if those slots are used up and the CPU is
> not servicing irqs then stuff gets retried. This might even affect NMIs
> sent via APIC messages - not sure about that.

- Kexec code does not wait indefinitely for the destination cpu to respond
to the NMI. If the destination cpu does not respond within a certain
amount of time, execution continues. So even if the NMI was not delivered
to the destination cpu, the kexec code should have continued. (Dangerous
though, as we don't know what the other cpu will be doing in the meantime.)

- Even if there is a limitation on how many interrupts can be queued up
(including NMIs), I am not sure how this patch would help in that situation.
This patch does not do anything on the destination cpu (assuming the
destination cpu is not also executing die_nmi() at the same time).

In fact, even if other cpus are servicing die_nmi() they will end up
spinning on kexec_lock inside crash_kexec().
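
To make the first point above concrete, here is a rough user-space sketch
of a bounded wait. It is only a model, not the actual kernel code: the
names pending_cpus, delay_one_ms(), ack_crash_nmi() and wait_for_cpus() are
made up for illustration, and the delay is a no-op so the sketch runs
instantly.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter: other cpus that have not yet acknowledged the NMI. */
static atomic_int pending_cpus;

/* Stand-in for a 1 ms busy delay; a no-op in this user-space model. */
static void delay_one_ms(void)
{
}

/* Modelled NMI handler on a destination cpu: report that it has stopped. */
static void ack_crash_nmi(void)
{
	atomic_fetch_sub(&pending_cpus, 1);
}

/*
 * Wait for the other cpus, but for at most timeout_ms iterations.
 * Returns true if every cpu acknowledged, false if we gave up waiting.
 * In both cases the caller carries on, which is the dangerous part: on
 * timeout nobody knows what the silent cpus are doing.
 */
static bool wait_for_cpus(int timeout_ms)
{
	while (atomic_load(&pending_cpus) > 0 && timeout_ms > 0) {
		delay_one_ms();
		timeout_ms--;
	}
	return atomic_load(&pending_cpus) == 0;
}
```

If pending_cpus never reaches zero, wait_for_cpus() still returns once the
timeout runs out, so a lost NMI alone should not stall the crashing cpu
at this step.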

I think there is more to this problem than just the EOI stuff.

Thanks
Vivek