On Thu, Dec 05 2024 at 08:22, Waiman Long wrote:
> On 12/5/24 8:12 AM, Thomas Gleixner wrote:
>>> Actually, crash_nmi_callback() can return in the case of the crashing
>>> CPUs, though all the other CPUs will not return once called. So I
>>> believe the current form is correct. I will update the comment to
>>> reflect that.
>>
>> Why would you continue servicing the NMI on a CPU which just crashed?
>
> According to crash_nmi_callback(),
>
>	/*
>	 * Don't do anything if this handler is invoked on crashing cpu.
>	 * Otherwise, system will completely hang. Crashing cpu can get
>	 * an NMI if system was initially booted with nmi_watchdog parameter.
>	 */
>	if (cpu == crashing_cpu)
>		return NMI_HANDLED;
>
> The crashing CPU still has work to do after shutting down other CPUs. It
> can't wait there forever without completing other crashing actions. The
> only thing I can see we can do is to return immediately without
> servicing other less important nmi handlers in the list.

I understand that, but in case that the crashed CPU receives an NMI and
sees that the emergency handler is set, shouldn't it stop the NMI
processing instead of trying to go through perf and what not when the
system is already in a fragile state? i.e.:

	if (emergency_handler) {
		emergency_handler();
		return;
	}
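
To make the control flow concrete, here is a minimal userspace sketch of that early return. It is not the kernel's NMI code: dispatch_nmi(), regular_handlers[], the two sample handlers and the regular_calls counter are hypothetical names made up for illustration, and returning NMI_HANDLED from the emergency path is an assumption (the snippet above only says "return").

```c
#include <stddef.h>

#define NMI_DONE	0
#define NMI_HANDLED	1

typedef int (*nmi_handler_fn)(void);

/* Hypothetical stand-ins for the emergency handler pointer and the
 * list of ordinary NMI handlers (perf and friends). */
static nmi_handler_fn emergency_handler;	/* NULL unless a crash is in progress */
static nmi_handler_fn regular_handlers[4];
static size_t nr_regular_handlers;
static int regular_calls;			/* counts ordinary handler invocations */

/* Sample ordinary handler: records that it ran. */
static int sample_perf_handler(void)
{
	regular_calls++;
	return NMI_HANDLED;
}

/* Sample emergency handler: returns, as on the crashing CPU. */
static int sample_crash_handler(void)
{
	return NMI_HANDLED;
}

static int dispatch_nmi(void)
{
	int handled = 0;
	size_t i;

	/* Emergency handler set: run it and stop right here, without
	 * walking the ordinary handler list. */
	if (emergency_handler) {
		emergency_handler();
		return NMI_HANDLED;	/* assumed return value */
	}

	for (i = 0; i < nr_regular_handlers; i++)
		handled += regular_handlers[i]();

	return handled ? NMI_HANDLED : NMI_DONE;
}
```

With only sample_perf_handler registered, dispatch_nmi() runs it; once emergency_handler points at sample_crash_handler, a further dispatch_nmi() leaves regular_calls unchanged, which is the behaviour being asked for.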