[PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI

From: Douglas Anderson
Date: Wed Apr 19 2023 - 18:57:07 EST


This is an attempt to resurrect Sumit's old patch series [1] that
allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
also to round up CPUs in kdb/kgdb. The last post from Sumit that I
could find was v7, so I called this series v8. I haven't copied all of
his old changelogs here, but you can find them via the link [1].

Since v7, I have:
* Addressed the small amount of feedback that was there for v7.
* Rebased.
* Added a new patch that prevents us from spamming the logs with
  backtraces of idle tasks.
* Added an extra patch to gracefully fall back to regular IPIs if
  pseudo-NMIs aren't available.
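
To illustrate the fallback idea in the last item, here is a minimal
userspace sketch. It is not code from the series: every name in it
(ipi_config, pick_ipi_delivery(), can_reach_cpu()) is hypothetical, and
it only models the decision "use an NMI if pseudo-NMI is enabled,
otherwise use a regular IPI" plus the consequence for which CPUs can
be reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not names from the series. */
enum ipi_delivery { DELIVER_AS_NMI, DELIVER_AS_IRQ };

struct ipi_config {
	bool pseudo_nmi_enabled;	/* is pseudo-NMI support enabled? */
};

/* Prefer NMI delivery; otherwise fall back to a regular IPI. */
static enum ipi_delivery pick_ipi_delivery(const struct ipi_config *cfg)
{
	assert(cfg != NULL);
	return cfg->pseudo_nmi_enabled ? DELIVER_AS_NMI : DELIVER_AS_IRQ;
}

/*
 * Simplified model: a regular IPI is not taken by a CPU that has
 * interrupts masked, while an NMI still is.
 */
static bool can_reach_cpu(enum ipi_delivery d, bool target_irqs_masked)
{
	return d == DELIVER_AS_NMI || !target_irqs_masked;
}
```

Under this model the fallback still yields backtraces from CPUs that
have interrupts enabled; only CPUs stuck with interrupts masked stay
out of reach when pseudo-NMI is off.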

Since there appear to be a few different patch series related to
being able to use NMIs to get stack traces of crashed systems, let me
try to organize them to the best of my understanding:

a) This series. On its own, a) will (among other things) enable stack
traces of all running processes with the soft lockup detector if
you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
its own, a) doesn't give a hard lockup detector.

b) A different recently-posted series [2] that adds a hard lockup
detector based on perf. On its own, b) gives a stack crawl of the
locked up CPU but no stack crawls of other CPUs (even if they're
   locked too). Together, a) + b) give us everything (full lockup
   detection, full ability to get stack crawls).

c) The old Android "buddy" hard lockup detector [3] that I'm
considering trying to upstream. If b) lands then I believe c) would
be redundant (at least for arm64). c) on its own is really only
useful on arm64 for platforms that can print CPU_DBGPCSR somehow
(see [4]). a) + c) is roughly as good as a) + b).
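
For anyone wanting to try a), the sysctl mentioned there is backed by
the file /proc/sys/kernel/softlockup_all_cpu_backtrace. Below is a
small sketch of turning it on from a C program. The helper name
write_sysctl_flag() is made up for this example, and writing the real
file needs root and a kernel built with the soft lockup detector.

```c
#include <stdio.h>

/* File backing the kernel.softlockup_all_cpu_backtrace sysctl. */
#define SOFTLOCKUP_ALL_CPU_BT \
	"/proc/sys/kernel/softlockup_all_cpu_backtrace"

/*
 * Hypothetical helper: write 1 (enable) or 0 (disable) to a sysctl
 * file. Returns 0 on success and -1 on any failure.
 */
static int write_sysctl_flag(const char *path, int enable)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", enable ? 1 : 0) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Calling write_sysctl_flag(SOFTLOCKUP_ALL_CPU_BT, 1) as root should
have the same effect as setting the sysctl to 1 by other means. The
path is a parameter so the helper can be tried against an ordinary
file first.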

[1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@xxxxxxxxxx/
[2] https://lore.kernel.org/linux-arm-kernel/20220903093415.15850-1-lecopzer.chen@xxxxxxxxxxxx/
[3] https://issuetracker.google.com/172213097
[4] https://issuetracker.google.com/172213129

Changes in v8:
- dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
- Add loongarch support, too
- Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
- "Tag the arm64 idle functions as __cpuidle" new for v8
- "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
- "Fallback to a regular IPI if NMI isn't enabled" new for v8

Douglas Anderson (3):
arm64: idle: Tag the arm64 idle functions as __cpuidle
kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB
arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled

Sumit Garg (7):
arm64: Add framework to turn IPI as NMI
irqchip/gic-v3: Enable support for SGIs to act as NMIs
arm64: smp: Assign and setup an IPI as NMI
nmi: backtrace: Allow runtime arch specific override
arm64: ipi_nmi: Add support for NMI backtrace
kgdb: Expose default CPUs roundup fallback mechanism
arm64: kgdb: Roundup cpus using IPI as NMI

arch/arm/include/asm/irq.h | 2 +-
arch/arm/kernel/smp.c | 3 +-
arch/arm64/include/asm/irq.h | 4 ++
arch/arm64/include/asm/nmi.h | 17 +++++
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/idle.c | 4 +-
arch/arm64/kernel/ipi_nmi.c | 103 +++++++++++++++++++++++++++++++
arch/arm64/kernel/kgdb.c | 18 ++++++
arch/arm64/kernel/smp.c | 8 +++
arch/loongarch/include/asm/irq.h | 2 +-
arch/loongarch/kernel/process.c | 3 +-
arch/mips/include/asm/irq.h | 2 +-
arch/mips/kernel/process.c | 3 +-
arch/powerpc/include/asm/nmi.h | 2 +-
arch/powerpc/kernel/stacktrace.c | 3 +-
arch/sparc/include/asm/irq_64.h | 2 +-
arch/sparc/kernel/process_64.c | 4 +-
arch/x86/include/asm/irq.h | 2 +-
arch/x86/kernel/apic/hw_nmi.c | 3 +-
drivers/irqchip/irq-gic-v3.c | 29 ++++++---
include/linux/kgdb.h | 13 ++++
include/linux/nmi.h | 12 ++--
kernel/debug/debug_core.c | 8 ++-
23 files changed, 217 insertions(+), 32 deletions(-)
create mode 100644 arch/arm64/include/asm/nmi.h
create mode 100644 arch/arm64/kernel/ipi_nmi.c

--
2.40.0.634.g4ca3ef3211-goog