[PATCH] timer_list: avoid other cpu soft lockup when printing timer list

From: Yang Yingliang
Date: Wed Feb 19 2020 - 22:17:02 EST


If system has many cpus (e.g. 128), it will spend a lot of time to
print message to the console when execute echo q > /proc/sysrq-trigger.

When /proc/sys/kernel/numa_balancing is enabled, if the migration threads
are woke up, the migration thread that on print mesasage cpu can't run
until the print finish, another migration thread may trigger soft lockup.

PID: 619 TASK: ffffa02fdd8bec80 CPU: 121 COMMAND: "migration/121"
#0 [ffff00000a103b10] __crash_kexec at ffff0000081bf200
#1 [ffff00000a103ca0] panic at ffff0000080ec93c
#2 [ffff00000a103d80] watchdog_timer_fn at ffff0000081f8a14
#3 [ffff00000a103e00] __run_hrtimer at ffff00000819701c
#4 [ffff00000a103e40] __hrtimer_run_queues at ffff000008197420
#5 [ffff00000a103ea0] hrtimer_interrupt at ffff00000819831c
#6 [ffff00000a103f10] arch_timer_dying_cpu at ffff000008b53144
#7 [ffff00000a103f30] handle_percpu_devid_irq at ffff000008174e34
#8 [ffff00000a103f70] generic_handle_irq at ffff00000816c5e8
#9 [ffff00000a103f90] __handle_domain_irq at ffff00000816d1f4
#10 [ffff00000a103fd0] gic_handle_irq at ffff000008081860
--- <IRQ stack> ---
#11 [ffff00000d6e3d50] el1_irq at ffff0000080834c8
#12 [ffff00000d6e3d60] multi_cpu_stop at ffff0000081d9964
#13 [ffff00000d6e3db0] cpu_stopper_thread at ffff0000081d9cfc
#14 [ffff00000d6e3e10] smpboot_thread_fn at ffff00000811e0a8
#15 [ffff00000d6e3e70] kthread at ffff000008118988

To avoid this soft lockup, add touch_all_softlockup_watchdogs()
in sysrq_timer_list_show()

Signed-off-by: Yang Yingliang <yangyingliang@xxxxxxxxxx>
---
kernel/time/timer_list.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index acb326f..4cb0e6f 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -289,13 +289,17 @@ void sysrq_timer_list_show(void)

timer_list_header(NULL, now);

- for_each_online_cpu(cpu)
+ for_each_online_cpu(cpu) {
+ touch_all_softlockup_watchdogs();
print_cpu(NULL, cpu, now);
+ }

#ifdef CONFIG_GENERIC_CLOCKEVENTS
timer_list_show_tickdevices_header(NULL);
- for_each_online_cpu(cpu)
+ for_each_online_cpu(cpu) {
+ touch_all_softlockup_watchdogs();
print_tickdevice(NULL, tick_get_device(cpu), cpu);
+ }
#endif
return;
}
--
1.8.3