Re: [PATCH] x86/nmi: Use trylock in __register_nmi_handler() when in_nmi()

From: Waiman Long
Date: Thu Nov 28 2024 - 20:06:38 EST



On 11/28/24 4:28 AM, Peter Zijlstra wrote:
> On Wed, Nov 27, 2024 at 06:34:55PM -0500, Waiman Long wrote:
> > The __register_nmi_handler() function can be called in NMI context from
> > nmi_shootdown_cpus() leading to a lockdep splat like the following.
>
> This seems fundamentally insane. Why are we okay with this?

According to the function comment of nmi_shootdown_cpus(),

 * nmi_shootdown_cpus() can only be invoked once. After the first
 * invocation all other CPUs are stuck in crash_nmi_callback() and
 * cannot respond to a second NMI.

That is why it has to install crash_nmi_callback() with register_nmi_handler() here, in NMI context. Changing this would require a fundamental redesign of how this shutdown process is handled, and I am not knowledgeable enough to do that. I would certainly appreciate any ideas for handling it more gracefully.
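
For illustration only, here is a rough userspace analogue of the trylock idea named in the patch subject. It is not the kernel code: struct handler, list_lock, register_handler() and the in_nmi_ctx flag are hypothetical stand-ins, and returning -1 on contention is just an assumption made for the sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered handler; not the kernel's type. */
struct handler {
	int (*fn)(void);
	struct handler *next;
};

/* A toy spinlock protecting the handler list. */
static atomic_flag list_lock = ATOMIC_FLAG_INIT;
static struct handler *handlers;

/* Try once to take the lock; returns true if we now own it. */
static bool try_lock(void)
{
	return !atomic_flag_test_and_set_explicit(&list_lock,
						  memory_order_acquire);
}

/* Spin until the lock is taken. Only safe where waiting is allowed. */
static void lock(void)
{
	while (!try_lock())
		;
}

static void unlock(void)
{
	atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

/*
 * Add h to the head of the list.
 *
 * In the simulated NMI context we must not spin: if the interrupted
 * code already holds the lock, spinning would never finish. So we try
 * once and give up (return -1, an assumed policy) on contention.
 */
static int register_handler(struct handler *h, bool in_nmi_ctx)
{
	if (in_nmi_ctx) {
		if (!try_lock())
			return -1;
	} else {
		lock();
	}

	h->next = handlers;
	handlers = h;

	unlock();
	return 0;
}
```

The point of the sketch is only the asymmetry: the normal path may wait for the lock, while the NMI-like path either gets it immediately or backs off without touching the list.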

Cheers,
Longman