Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5)
From: Mathieu Desnoyers
Date: Thu Apr 17 2008 - 17:16:41 EST
* Andrew Morton (akpm@xxxxxxxxxxxxxxxxxxxx) wrote:
> On Thu, 17 Apr 2008 16:14:10 -0400
> Mathieu Desnoyers <compudj@xxxxxxxxxxxxxxxxxx> wrote:
>
> > +#define nmi_enter() \
> > + do { \
> > + lockdep_off(); \
> > + BUG_ON(hardnmi_count()); \
> > + add_preempt_count(HARDNMI_OFFSET); \
> > + __irq_enter(); \
> > + } while (0)
>
> <did it _have_ to be a macro?>
>
Isn't this real macro artwork? ;) I kept the coding style that was
already there, which mimics the irq_enter/irq_exit macros. Changing all
of them at once could be done in a separate patch.

> Doing BUG() inside an NMI should be OK most of the time. But the
> BUG-handling code does want to know if we're in interrupt context - at
> least for the "fatal exception in interrupt" stuff, and probably other
> things.
>
> But afacit the failure to include HARDNMI_MASK in
>
> #define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK))
>
> will prevent that.
>
> So.
>
> Should we or should we not make in_interrupt() return true in NMI?
> "should", I expect.
>
> If not, we'd need to do something else to communicate the current
> processing state down to the BUG-handling code.
>
You raise an interesting question. In practice, this BUG_ON can only
trigger if an NMI nests over another NMI, or if an NMI fails to
decrement its HARDNMI_MASK count. Given that the HARDIRQ_MASK count is
incremented right after the HARDNMI_MASK increment (the reverse is also
true on exit), really bad things (TM) must have happened for the BUG_ON
to be triggered outside of the __irq_enter()/__irq_exit() scope of the
NMI below the buggy one.

But since this code is there to extract as much information as possible
when things go wrong, I would say it's safer to, at least, add
HARDNMI_MASK to irq_count().

Instead, though, I think we could add:

	if (in_nmi())
		panic("Fatal exception in non-maskable interrupt");

to die(). That would be clearer. I just added it to x86_32, but can't
find where x86_64 reports the "fatal exception in interrupt" and friends
message. Any idea?

By dealing with this case specifically, I think we don't really have to
add HARDNMI_MASK to irq_count(), considering an NMI is normally counted
as a HARDIRQ too.

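To make the mask arithmetic concrete, here is a rough userspace sketch
in C. It is not the patch itself: the bit positions, the sim_ prefixed
names and irq_count_with_nmi() are made up for illustration, and
sim_nmi_enter() assumes that the only effect of __irq_enter() that
matters here is adding HARDIRQ_OFFSET to the count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical bit layout, for illustration only. The real masks and
 * offsets are defined by the kernel headers and may differ.
 */
#define SOFTIRQ_MASK	0x0000ff00u
#define HARDIRQ_MASK	0x0fff0000u
#define HARDNMI_MASK	0x10000000u
#define HARDIRQ_OFFSET	0x00010000u
#define HARDNMI_OFFSET	0x10000000u

/* Stand-in for the per-task preempt_count(). */
static unsigned int sim_preempt_count;

static unsigned int hardnmi_count(void)
{
	return sim_preempt_count & HARDNMI_MASK;
}

static bool in_nmi(void)
{
	return hardnmi_count() != 0;
}

/* irq_count() as quoted above: HARDNMI_MASK is not part of it. */
static unsigned int irq_count(void)
{
	return sim_preempt_count & (HARDIRQ_MASK | SOFTIRQ_MASK);
}

/* Hypothetical variant with HARDNMI_MASK folded in. */
static unsigned int irq_count_with_nmi(void)
{
	return sim_preempt_count &
		(HARDNMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK);
}

/* Mirrors the order of operations in the nmi_enter() macro. */
static void sim_nmi_enter(void)
{
	assert(!hardnmi_count());		/* BUG_ON(hardnmi_count()) */
	sim_preempt_count += HARDNMI_OFFSET;
	sim_preempt_count += HARDIRQ_OFFSET;	/* assumed __irq_enter() effect */
}

/* Undoes sim_nmi_enter() in the reverse order. */
static void sim_nmi_exit(void)
{
	sim_preempt_count -= HARDIRQ_OFFSET;
	sim_preempt_count -= HARDNMI_OFFSET;
}

/* Which panic message a die()-like path would pick, or NULL for none. */
static const char *sim_die_message(void)
{
	if (in_nmi())
		return "Fatal exception in non-maskable interrupt";
	if (irq_count())
		return "Fatal exception in interrupt";
	return NULL;
}
```

Inside a simulated NMI both in_nmi() and irq_count() are non-zero, so
the in_nmi() check has to come first to get the NMI-specific message.
Only in the window between the two increments does irq_count() miss
what irq_count_with_nmi() would catch.
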
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68