Re: [PATCH v4] x86/nmi: Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus()

From: Waiman Long
Date: Wed Feb 05 2025 - 21:47:14 EST


On 2/5/25 4:20 AM, Thomas Gleixner wrote:
On Thu, Dec 19 2024 at 10:06, Waiman Long wrote:
Depending on the type of panics, it was found that the
__register_nmi_handler() function can be called in NMI context from
nmi_shootdown_cpus() leading to a lockdep splat like the following.

[ 1123.133573] ================================
[ 1123.137845] WARNING: inconsistent lock state
[ 1123.142118] 6.12.0-31.el10.x86_64+debug #1 Not tainted
[ 1123.147257] --------------------------------
[ 1123.151529] inconsistent {INITIAL USE} -> {IN-NMI} usage.
:
[ 1123.261544] Possible unsafe locking scenario:
[ 1123.261544]
[ 1123.267463] CPU0
[ 1123.269915] ----
[ 1123.272368] lock(&nmi_desc[0].lock);
[ 1123.276122] <Interrupt>
[ 1123.278746] lock(&nmi_desc[0].lock);
[ 1123.282671]
[ 1123.282671] *** DEADLOCK ***
:
[ 1123.314088] Call Trace:
[ 1123.316542] <NMI>
[ 1123.318562] dump_stack_lvl+0x6f/0xb0
[ 1123.322230] print_usage_bug.part.0+0x3d3/0x610
[ 1123.330618] lock_acquire.part.0+0x2e6/0x360
[ 1123.357217] _raw_spin_lock_irqsave+0x46/0x90
[ 1123.366193] __register_nmi_handler+0x8f/0x3a0
[ 1123.374401] nmi_shootdown_cpus+0x95/0x120
[ 1123.378509] kdump_nmi_shootdown_cpus+0x15/0x20
[ 1123.383040] native_machine_crash_shutdown+0x54/0x160
[ 1123.388095] __crash_kexec+0x10f/0x1f0
[ 1123.421465] ? __ghes_panic.cold+0x4f/0x5d
[ 1123.482648] </NMI>
Can you please trim backtraces properly according to documentation:

https://www.kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages

Thanks for the pointer, will trim down the backtrace.

In this particular case, the following panic message was printed before.
One way to address this problem is to remove all the panic() calls from
NMI context, but that can be too restrictive.

Another way to fix this problem while allowing panic() calls from
Fix it by doing ....

NMI context is by adding a new emergency NMI handler to the nmi_desc
structure and provide a new set_emergency_nmi_handler() helper to
atomically set crash_nmi_callback() in any context. The new emergency
handler will preempt other handlers in the linked list. That will
eliminate the need to take any lock and serve the panic in NMI use case.
+/*
+ * An emergency handler can be set in any context including NMI
+ */
struct nmi_desc {
raw_spinlock_t lock;
+ nmi_handler_t emerg_handler; /* Emergency handler */
This tail comment is not only useless

+/**
+ * set_emergency_nmi_handler - Set emergency handler
+ * @type: NMI type
+ * @handler: the emergency handler to be stored
+ * Return: 0 if success, -EEXIST if a handler had been stored
+ *
+ * Atomically set an emergency NMI handler which, if set, will preempt all
+ * the other handlers in the linked list. If a NULL handler is passed in,
+ * it will clear it.
+ */
+int set_emergency_nmi_handler(unsigned int type, nmi_handler_t handler)
+{
+ struct nmi_desc *desc = nmi_to_desc(type);
+ nmi_handler_t orig = NULL;
+
+ if (!handler) {
+ orig = READ_ONCE(desc->emerg_handler);
+ WARN_ON_ONCE(!orig);
+ }
+
+ if (try_cmpxchg(&desc->emerg_handler, &orig, handler))
+ return 0;
What's the point of this cmpxchg()? What's the concurrency problem this
tries to address?
It is because I am not sure if there can only be one instance of nmi_shootdown_cpus() at any time. If there can't be more than one instance, I can remove the atomic instruction. I do remove the smp_wmb() in nmi_shootdown_cpus(). If I don't use an atomic instruction, I will have to add it smp_wmb() here.

+ if (WARN_ON_ONCE(orig == handler))
+ return 0;
+ WARN_ONCE(1, "%s: failed to set emergency NMI handler!\n", __func__);
+ return -EEXIST;
These return values are there to be ignored at the only call site. So
what's the point of having them in the first place?

The first version of the patch did check the return value. I will change it to a void function.

Thanks for the review.

Cheers,
Longman