Re: [PATCH] x86 NMI: Be smarter about invoking panic() inside NMI handler.
From: Don Zickus
Date: Tue Mar 27 2012 - 12:06:13 EST
On Tue, Mar 20, 2012 at 01:57:41PM -0400, Andrei E. Warkentin wrote:
> 2012/3/1 Andrei Warkentin <andrey.warkentin@xxxxxxxxx>:
> > If two (or more) unknown NMIs arrive on different CPUs, there
> > is a large chance both CPUs will wind up inside panic(). This
> > is fine, unless you want to enter KDB - KDB cannot round up
> > all CPUs, because some of them are stuck inside
> > panic_smp_self_stop with NMI latched. This is
> > easy to replicate with QEMU. Boot with -smp 4 and
> > send NMI using the monitor.
> > Solution for this - attempt to enter panic() from NMI
> > handler. If panic() is already active in the system,
> > just exit out of the NMI handler. This lets KDB round
> > up CPUs.
> > Signed-off-by: Andrei Warkentin <andrey.warkentin@xxxxxxxxx>
> > ---
> Any feedback on this? Who are the right maintainers to bug about this?
Hmm, if try_panic fails, then the cpu continues on executing code. This
might further corrupt an already broken system. So I don't think this
patch will work as is.
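To make the concern concrete, here is a rough userspace sketch of how a
"try" style panic entry could behave. This is not the code from the patch:
try_panic_sketch(), unknown_nmi_sketch() and panic_owner are made-up names,
and a C11 compare-and-swap stands in for whatever the patch uses to decide
who owns the panic. The point is only that the cpu which loses the race
returns from the handler and keeps running.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a "try" panic entry; not the patch's real code. */
#define NO_OWNER (-1)

/* Id of the cpu that claimed the panic, or NO_OWNER if nobody has yet. */
static atomic_int panic_owner = NO_OWNER;

/* Returns true if this cpu claimed the panic, false if another cpu did. */
static bool try_panic_sketch(int cpu)
{
	int expected = NO_OWNER;

	return atomic_compare_exchange_strong(&panic_owner, &expected, cpu);
}

/*
 * Models the unknown-NMI path. Returns true if the cpu would stop in
 * panic(), false if it leaves the handler and continues executing.
 */
static bool unknown_nmi_sketch(int cpu)
{
	if (try_panic_sketch(cpu))
		return true;	/* winner: would call panic() and never return */

	return false;		/* loser: handler exits, cpu keeps running */
}
```

With two cpus taking the NMI, only the first caller gets true. The second
one falls through and resumes whatever it was doing, which is the case I am
worried about.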
Perhaps instead of panic'ing in the NMI context, we use irq_work and panic
in an interrupt context instead. We still get the system to stop (though
it might still execute some interrupts) and it will be out of the NMI
context.
However, you will still run into a similar problem when in the
panic/reboot case we shut down all the remote cpus and have them sitting in
a similar cpu_relax loop in the NMI context, while the panic'ing cpu
cleans things up.
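For the irq_work idea above, the shape I have in mind is roughly the
following. It is a userspace model only, so it can be compiled and poked at
outside the kernel: struct deferred_work, queue_from_nmi(), run_deferred()
and deferred_panic() are invented names, and in the kernel the queueing
side would presumably be irq_work_queue() with the callback calling
panic(). The NMI side does nothing except mark the work pending, and the
actual panic runs later from the interrupt side.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model only; none of these names are the kernel's. */
struct deferred_work {
	atomic_bool pending;	/* set from the NMI side, cleared when run */
	void (*func)(void);	/* callback to run outside NMI context */
};

static int panics_run;		/* counts how often the deferred panic ran */

/* Stand-in for the callback that would call panic(). */
static void deferred_panic(void)
{
	panics_run++;
}

static struct deferred_work nmi_panic_work = { .func = deferred_panic };

/*
 * NMI side: only mark the work pending and return.
 * Returns false if the work was already queued.
 */
static bool queue_from_nmi(struct deferred_work *w)
{
	return !atomic_exchange(&w->pending, true);
}

/* Interrupt side: run the callback once if it was queued. */
static void run_deferred(struct deferred_work *w)
{
	if (atomic_exchange(&w->pending, false))
		w->func();
}
```

Note that between queue_from_nmi() and run_deferred() the cpu is still
executing code, so this narrows the window but does not close it.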