Re: [PATCH v4] x86/nmi: Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus()

From: Thomas Gleixner
Date: Wed Feb 05 2025 - 04:21:00 EST


On Thu, Dec 19 2024 at 10:06, Waiman Long wrote:
> Depending on the type of panics, it was found that the
> __register_nmi_handler() function can be called in NMI context from
> nmi_shootdown_cpus() leading to a lockdep splat like the following.
>
> [ 1123.133573] ================================
> [ 1123.137845] WARNING: inconsistent lock state
> [ 1123.142118] 6.12.0-31.el10.x86_64+debug #1 Not tainted
> [ 1123.147257] --------------------------------
> [ 1123.151529] inconsistent {INITIAL USE} -> {IN-NMI} usage.
> :
> [ 1123.261544] Possible unsafe locking scenario:
> [ 1123.261544]
> [ 1123.267463] CPU0
> [ 1123.269915] ----
> [ 1123.272368] lock(&nmi_desc[0].lock);
> [ 1123.276122] <Interrupt>
> [ 1123.278746] lock(&nmi_desc[0].lock);
> [ 1123.282671]
> [ 1123.282671] *** DEADLOCK ***
> :
> [ 1123.314088] Call Trace:
> [ 1123.316542] <NMI>
> [ 1123.318562] dump_stack_lvl+0x6f/0xb0
> [ 1123.322230] print_usage_bug.part.0+0x3d3/0x610
> [ 1123.330618] lock_acquire.part.0+0x2e6/0x360
> [ 1123.357217] _raw_spin_lock_irqsave+0x46/0x90
> [ 1123.366193] __register_nmi_handler+0x8f/0x3a0
> [ 1123.374401] nmi_shootdown_cpus+0x95/0x120
> [ 1123.378509] kdump_nmi_shootdown_cpus+0x15/0x20
> [ 1123.383040] native_machine_crash_shutdown+0x54/0x160
> [ 1123.388095] __crash_kexec+0x10f/0x1f0
> [ 1123.421465] ? __ghes_panic.cold+0x4f/0x5d
> [ 1123.482648] </NMI>

Can you please trim backtraces properly according to documentation:

https://www.kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages

> In this particular case, the following panic message was printed before.
> One way to address this problem is to remove all the panic() calls from
> NMI context, but that can be too restrictive.
>
> Another way to fix this problem while allowing panic() calls from

Fix it by doing ....

> NMI context is by adding a new emergency NMI handler to the nmi_desc
> structure and provide a new set_emergency_nmi_handler() helper to
> atomically set crash_nmi_callback() in any context. The new emergency
> handler will preempt other handlers in the linked list. That will
> eliminate the need to take any lock and serve the panic in NMI use case.
> +/*
> + * An emergency handler can be set in any context including NMI
> + */
> struct nmi_desc {
> raw_spinlock_t lock;
> + nmi_handler_t emerg_handler; /* Emergency handler */

This tail comment is not only useless

> +/**
> + * set_emergency_nmi_handler - Set emergency handler
> + * @type: NMI type
> + * @handler: the emergency handler to be stored
> + * Return: 0 if success, -EEXIST if a handler had been stored
> + *
> + * Atomically set an emergency NMI handler which, if set, will preempt all
> + * the other handlers in the linked list. If a NULL handler is passed in,
> + * it will clear it.
> + */
> +int set_emergency_nmi_handler(unsigned int type, nmi_handler_t handler)
> +{
> + struct nmi_desc *desc = nmi_to_desc(type);
> + nmi_handler_t orig = NULL;
> +
> + if (!handler) {
> + orig = READ_ONCE(desc->emerg_handler);
> + WARN_ON_ONCE(!orig);
> + }
> +
> + if (try_cmpxchg(&desc->emerg_handler, &orig, handler))
> + return 0;

What's the point of this cmpxchg()? What's the concurrency problem this
tries to address?

> + if (WARN_ON_ONCE(orig == handler))
> + return 0;
> + WARN_ONCE(1, "%s: failed to set emergency NMI handler!\n", __func__);
> + return -EEXIST;

These return values are there to be ignored at the only call site. So
what's the point of having them in the first place?

Thanks,

tglx