[PATCH v2 0/6] Add support for NMI source reporting

From: Jacob Pan
Date: Tue Jun 11 2024 - 12:49:17 EST


Hi Thomas and all,

(resend cover letter)

Non-Maskable Interrupts (NMIs) are routed to the local Advanced Programmable
Interrupt Controller (APIC) using vector #2. Before the advent of the
Flexible Return and Event Delivery (FRED)[1], the vector information set by
the NMI initiator was disregarded or lost within the hardware, compelling
system software to poll every registered NMI handler to pinpoint the source
of the NMI[2]. This approach led to several issues:

1. Inefficiency due to the CPU's time spent polling all handlers.
2. Increased latency from the additional time taken to poll all handlers.
3. The occurrence of unnecessary NMIs if they are triggered shortly
after being processed by a different source.

To tackle these challenges, Intel introduced NMI source reporting as a part
of the FRED specification (detailed in Chapter 9). This CPU feature ensures
that while all NMI sources are still aggregated into NMI vector (#2) for
delivery, the source of the NMI is now conveyed through FRED event data
(a 16-bit bitmap on the stack). This allows for the selective dispatch
of the NMI source handler based on the bitmap, eliminating the need to
invoke all NMI source handlers indiscriminately.

In line with the hardware architecture, various interrupt sources can
generate NMIs by encoding an NMI delivery mode. However, this patchset
activates only the local NMI sources that are currently utilized by the
Linux kernel, which includes:

1. Performance monitoring.
2. Inter-Processor Interrupts (IPIs) for functions like CPU backtrace,
machine check, Kernel GNU Debugger (KGDB), reboot, panic stop, and
self-test.

Other NMI sources will continue to be handled as previously when the NMI
source is not utilized or remains unidentified.

Next steps:
1. KVM support
2. Optimization to reuse IDT NMI vector 2 as NMI source for "known" source.
Link:https://lore.kernel.org/lkml/746fecd5-4c79-42f9-919e-912ec415e73f@xxxxxxxxx/


[1] https://www.intel.com/content/www/us/en/content-details/779982/flexible-return-and-event-delivery-fred-specification.html
[2] https://lore.kernel.org/lkml/171011362209.2468526.15187874627966416701.tglx@xen13/


Thanks,

Jacob

---
Change logs are in individual patches.

Jacob Pan (6):
x86/irq: Add enumeration of NMI source reporting CPU feature
x86/irq: Extend NMI handler registration interface to include source
x86/irq: Factor out common NMI handling code
x86/irq: Process nmi sources in NMI handler
perf/x86: Enable NMI source reporting for perfmon
x86/irq: Enable NMI source on IPIs delivered as NMI

arch/x86/Kconfig | 9 +++
arch/x86/events/amd/ibs.c | 2 +-
arch/x86/events/core.c | 11 ++-
arch/x86/events/intel/core.c | 6 +-
arch/x86/include/asm/apic.h | 1 +
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/irq_vectors.h | 38 +++++++++
arch/x86/include/asm/nmi.h | 4 +-
arch/x86/kernel/apic/hw_nmi.c | 5 +-
arch/x86/kernel/apic/ipi.c | 4 +-
arch/x86/kernel/apic/local.h | 18 +++--
arch/x86/kernel/cpu/mce/inject.c | 4 +-
arch/x86/kernel/cpu/mshyperv.c | 2 +-
arch/x86/kernel/kgdb.c | 6 +-
arch/x86/kernel/nmi.c | 99 +++++++++++++++++++++---
arch/x86/kernel/nmi_selftest.c | 7 +-
arch/x86/kernel/reboot.c | 4 +-
arch/x86/kernel/smp.c | 4 +-
arch/x86/kernel/traps.c | 4 +-
arch/x86/platform/uv/uv_nmi.c | 4 +-
drivers/acpi/apei/ghes.c | 2 +-
drivers/char/ipmi/ipmi_watchdog.c | 2 +-
drivers/edac/igen6_edac.c | 2 +-
drivers/watchdog/hpwdt.c | 6 +-
25 files changed, 200 insertions(+), 53 deletions(-)

--
2.25.1