Re: [PATCH RFC] NMI Re-introduce un[set]_nmi_callback

From: Ingo Molnar
Date: Fri Sep 05 2008 - 05:33:56 EST



* Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:

> > Add "kdump" to the list. It will also be broken if we decide to let
> > one driver hijack the NMI handler.
>
> kdump is a special case, similar to the NMI button panic mode. It
> should be always only active when the user configured it. When the
> user configured it should be always the fallback and override any
> other drivers.

if by 'any other drivers' you mean all other notifiers then that's wrong
- kdump must still come after many other NMI sources.

Basically, the sane order is this:

highest priority:

instruction patching callbacks. (kprobes, mmiotrace, kmemcheck) These
are abstractions that are essential for the kernel to properly
function/execute. We dont ever want them to be overriden.

high priority:

CPU-generated profiling callbacks. (nmi-lapic watchdog, performance
counter generated NMIs) These are 'good' NMI sources that can (well,
'could') identify themselves, and they can also come in very
frequently so we want to execute them early.

mid priority:

optional/interactive debug facilities. (kdump, KGDB, trace dump, NMI
button).
The user enables them optionally and wants them catch all non-expected
or interactive NMI events.

normal priority:

various platform drivers. Infrequent NMI sources. It's what we use to
make unexpected events slightly more informative when the user does
not configure any explicit debugging helper.

lowest priority:

fallback legacy platform handlers - 61H reads, etc.

All in one, the patch submitted here is wrong as a generic facility. One
valid aspect of the patch is that the port 61H reads (and other
architecture code chipset ops) should execute as a regular notifier and
with low priority.

as it does not really solve anything in a structured way, it allows
platform drivers to install a super-high priority notifier, creating
needless duplication and confusion. The exact reasons for the changes
should be listed instead and proper (and separate) patches done for each
reason, along the suggestions in this thread - i believe all issues were
covered in one way or another.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/