[PATCH AUTOSEL 5.10 84/85] watchdog/softlockup: remove logic that tried to prevent repeated reports

From: Sasha Levin
Date: Wed May 05 2021 - 13:07:17 EST

From: Petr Mladek <pmladek@xxxxxxxx>

[ Upstream commit 1bc503cb4a2638fb1c57801a7796aca57845ce63 ]

The softlockup detector does some gymnastic with the variable
soft_watchdog_warn. It was added by the commit 58687acba59266735ad
("lockup_detector: Combine nmi_watchdog and softlockup detector").

The purpose is not completely clear. There are the following clues. They
describe the situation how it looked after the above mentioned commit:

1. The variable was checked with a comment "only warn once".

2. The variable was set when softlockup was reported. It was cleared
only when the CPU was not longer in the softlockup state.

3. watchdog_touch_ts was not explicitly updated when the softlockup
was reported. Without this variable, the report would normally
be printed again during every following watchdog_timer_fn()

The logic has got even more tangled up by the commit ed235875e2ca98
("kernel/watchdog.c: print traces for all cpus on lockup detection").
After this commit, soft_watchdog_warn is set only when
softlockup_all_cpu_backtrace is enabled. But multiple reports from all
CPUs are prevented by a new variable soft_lockup_nmi_warn.


The variable probably never worked as intended. In each case, it has not
worked last many years because the softlockup was reported repeatedly
after the full period defined by watchdog_thresh.

The reason is that watchdog gets touched in many known slow paths, for
example, in printk_stack_address(). This code is called also when
printing the softlockup report. It means that the watchdog timestamp gets
updated after each report.


Simply remove the logic. People want the periodic report anyway.

Link: https://lkml.kernel.org/r/20210311122130.6788-5-pmladek@xxxxxxxx
Signed-off-by: Petr Mladek <pmladek@xxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Laurence Oberman <loberman@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Vincent Whitchurch <vincent.whitchurch@xxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
kernel/watchdog.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7776d53a015c..122e272ad7f2 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -172,7 +172,6 @@ static u64 __read_mostly sample_period;
static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts);
static DEFINE_PER_CPU(struct hrtimer, watchdog_hrtimer);
static DEFINE_PER_CPU(bool, softlockup_touch_sync);
-static DEFINE_PER_CPU(bool, soft_watchdog_warn);
static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved);
static unsigned long soft_lockup_nmi_warn;
@@ -394,19 +393,12 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
if (kvm_check_and_clear_guest_paused())

- /* only warn once */
- if (__this_cpu_read(soft_watchdog_warn) == true)
if (softlockup_all_cpu_backtrace) {
/* Prevent multiple soft-lockup reports if one cpu is already
* engaged in dumping cpu back traces
- if (test_and_set_bit(0, &soft_lockup_nmi_warn)) {
- /* Someone else will report us. Let's give up */
- __this_cpu_write(soft_watchdog_warn, true);
+ if (test_and_set_bit(0, &soft_lockup_nmi_warn))
- }

/* Start period for the next softlockup warning. */
@@ -436,9 +428,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
if (softlockup_panic)
panic("softlockup: hung tasks");
- __this_cpu_write(soft_watchdog_warn, true);
- } else
- __this_cpu_write(soft_watchdog_warn, false);
+ }