Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

From: Neil Horman
Date: Fri Feb 08 2008 - 11:16:18 EST


On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote:
>
> * Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
>
> > Ingo noted a few posts down that nmi_exit doesn't actually write to the
> > APIC EOI register, so yeah, I agree, it's bogus (and I apologize, I
> > should have checked that more carefully). Nevertheless, this patch
> > consistently allowed a hanging machine to boot through an NMI lockup.
> > So I'm forced to wonder what's going on that this patch helps
> > with. Perhaps it's just a very fragile timing issue; I'll need to
> > look more closely.
>
> try a dummy iret, something like:
>
> asm volatile ("pushf; push $1f; iret; 1: \n");
>
> to get the CPU out of its 'nested NMI' state. (totally untested)
>
> the idea is to push an iret frame onto the kernel stack that will
> just jump to the next instruction and get the CPU out of the NMI nesting.
> Note: interrupts will/must still be disabled, despite the iret. (the
> ordering of the pushes might be wrong, we might need more than that for
> a valid iret, etc. etc.)
>
> Ingo

I just tried this experiment and it worked: executing a dummy iret
instruction let us boot the kdump kernel successfully.
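
For reference, a fuller frame might look roughly like the sketch below. It is
a user-space x86-64 illustration only, not the code that was tested in the
die_nmi path. The name dummy_iret() is hypothetical, the 128-byte red-zone
adjustment is needed only because this is built outside the kernel, and on
other architectures the function does nothing. In 64-bit mode iretq always
pops SS and RSP as well as RFLAGS, CS and RIP, so the frame needs five words
rather than the two pushed in the one-liner above.

```c
/*
 * Hypothetical helper (the name is an assumption, not kernel API):
 * build a complete iretq frame on the current stack and "return" to
 * the very next instruction.
 *
 * Returns 1 if an iretq was executed, 0 where this sketch is a no-op.
 */
static int dummy_iret(void)
{
#if defined(__x86_64__)
	unsigned long long tmp;

	__asm__ volatile(
		"subq $128, %%rsp\n\t"     /* user space only: step over the red zone */
		"mov %%ss, %k0\n\t"
		"pushq %0\n\t"             /* SS */
		"leaq 8(%%rsp), %0\n\t"
		"pushq %0\n\t"             /* RSP to resume with (value before the SS push) */
		"pushfq\n\t"               /* RFLAGS, interrupt flag left as it was */
		"mov %%cs, %k0\n\t"
		"pushq %0\n\t"             /* CS */
		"leaq 1f(%%rip), %0\n\t"
		"pushq %0\n\t"             /* RIP = label 1 below */
		"iretq\n\t"
		"1:\n\t"
		"addq $128, %%rsp\n\t"     /* undo the red-zone adjustment */
		: "=&r" (tmp)
		:
		: "cc", "memory");
	return 1;
#else
	return 0;
#endif
}
```

Because RFLAGS is pushed unmodified, the interrupt flag is the same after the
iretq as before it, which matches the requirement that interrupts stay
disabled.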

Thoughts on how we should handle this from here?


Regards
Neil

--
/****************************************************
* Neil Horman <nhorman@xxxxxxxxxxxxx>
* Software Engineer, Red Hat
****************************************************/