Re: [PATCH 0/2] Disabling NMI watchdog during LPM's memory transfer

From: Michael Ellerman
Date: Thu Jun 09 2022 - 03:46:35 EST


Nathan Lynch <nathanl@xxxxxxxxxxxxx> writes:
> Laurent Dufour <ldufour@xxxxxxxxxxxxx> writes:
...
>
>> There are ongoing investigations to clarify where and how this latency is
>> happening. I'm not excluding any other issue in the Linux kernel, but right
>> now, this looks to be the best option to prevent system crash during
>> LPM.
>
> It will prevent the likely crash mode for enterprise distros with
> default watchdog tunables that our internal test environments happen to
> use. But if someone were to run the same scenario with softlockup_panic
> enabled, or with the RCU stall timeout lower than the watchdog
> threshold, the failure mode would be different.
>
> Basically I'm saying:
> * Some users may actually want the OS to panic when it's in this state,
> because their applications can't work correctly.
> * But if we're going to inhibit one watchdog, we should inhibit them
> all.

I'm sympathetic to both of your arguments.

But I think there is a key difference between the NMI watchdog and other
watchdogs, which is that the NMI watchdog will use the unsafe NMI to
interrupt other CPUs, and that can cause the system to crash when other
watchdogs would just print a backtrace.

We had the same problem with the rcu_sched stall detector until we
changed it to use the "safe" NMI, see:
5cc05910f26e ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")


So even if the NMI watchdog is disabled, the other watchdogs remain
enabled. They should print backtraces by default and, if desired, can
also be configured to cause a panic.
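
For anyone wanting to check how a given system is configured, here is a
rough userspace sketch that reads the tunables mentioned in this thread.
The paths are the usual procfs/sysfs locations, but any of them may be
absent depending on kernel config. The helper names (read_tunables,
panics_configured) are made up for illustration.

```python
from pathlib import Path

# Paths relative to the filesystem root. These knobs exist only on kernels
# built with the corresponding detectors, so any of them may be missing.
TUNABLES = {
    "nmi_watchdog": "proc/sys/kernel/nmi_watchdog",
    "watchdog_thresh": "proc/sys/kernel/watchdog_thresh",
    "softlockup_panic": "proc/sys/kernel/softlockup_panic",
    "hardlockup_panic": "proc/sys/kernel/hardlockup_panic",
    "rcu_cpu_stall_timeout": "sys/module/rcupdate/parameters/rcu_cpu_stall_timeout",
}


def read_tunables(root="/"):
    """Return {name: int or None}; None means missing or unreadable."""
    values = {}
    for name, rel in TUNABLES.items():
        try:
            values[name] = int((Path(root) / rel).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def panics_configured(values):
    """Names of the panic knobs that are currently switched on."""
    return [k for k in ("softlockup_panic", "hardlockup_panic") if values.get(k)]


if __name__ == "__main__":
    current = read_tunables()
    for name, value in current.items():
        print(f"{name}: {'n/a' if value is None else value}")
    print("panic on:", panics_configured(current) or "none")
```

The root argument is only there so the function can be pointed at a
fake tree for testing.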

Rather than disabling the NMI watchdog, can we instead increase the
timeout (by how much?) during LPM, so that it is less likely to fire in
normal usage but is still there as a backup if the system is completely
clogged?
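
To illustrate the shape of "raise the timeout, then put it back", here
is a userspace sketch only, not a proposal for the actual kernel change.
It assumes the generic watchdog_thresh sysctl is the knob being scaled;
per the sysctl documentation that value also sets the soft lockup
threshold (2 * watchdog_thresh), so this is coarser than an
NMI-watchdog-only change. The factor is a placeholder, since how much
to raise the timeout is still an open question, and the name
raised_watchdog_thresh is hypothetical.

```python
from contextlib import contextmanager
from pathlib import Path

# Real sysctl on kernels with lockup detectors; writing it needs root.
WATCHDOG_THRESH = "/proc/sys/kernel/watchdog_thresh"


@contextmanager
def raised_watchdog_thresh(factor, path=WATCHDOG_THRESH):
    """Multiply the threshold by `factor` for the duration of the block.

    `factor` is a placeholder value. The original threshold is restored
    even if the body of the `with` block raises.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1; this sketch never lowers the timeout")
    knob = Path(path)
    original = int(knob.read_text().strip())
    raised = int(original * factor)
    knob.write_text(f"{raised}\n")
    try:
        yield raised
    finally:
        # Always put the original value back, so the watchdog returns to
        # its normal sensitivity once the long-running operation is done.
        knob.write_text(f"{original}\n")
```

Usage would look like "with raised_watchdog_thresh(2): ..." wrapped
around whatever drives the migration. The watchdog stays armed the whole
time, just with a longer fuse.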

cheers