[PATCH v2 2/4] printk: disable optimistic spin during panic

From: Stephen Brennan
Date: Wed Jan 26 2022 - 18:03:17 EST


A CPU executing with console lock spinning enabled might be halted
during a panic. Before the panicking CPU calls console_flush_on_panic(),
it may call console_trylock(), which attempts to optimistically spin,
deadlocking the panic CPU:

CPU 0 (panic CPU) CPU 1
----------------- ------
printk() {
vprintk_func() {
vprintk_default() {
vprintk_emit() {
console_unlock() {
console_lock_spinning_enable();
... printing to console ...
panic() {
crash_smp_send_stop() {
NMI -------------------> HALT
}
atomic_notifier_call_chain() {
printk() {
...
console_trylock_spinnning() {
// optimistic spin infinitely

This hang during panic can be induced when a kdump kernel is loaded, and
crash_kexec_post_notifiers=1 is present on the kernel command line. The
following script which concurrently writes to /dev/kmsg, and triggers a
panic, can result in this hang:

#!/bin/bash
date
# 991 chars (based on log buffer size):
chars="$(printf 'a%.0s' {1..991})"
while :; do
echo $chars > /dev/kmsg
done &
echo c > /proc/sysrq-trigger &
date
exit

To avoid this deadlock, ensure that console_trylock_spinning() does not
allow spinning once a panic has begun.

Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes")

Suggested-by: Petr Mladek <pmladek@xxxxxxxx>
Signed-off-by: Stephen Brennan <stephen.s.brennan@xxxxxxxxxx>
Reviewed-by: Petr Mladek <pmladek@xxxxxxxx>
---
kernel/printk/printk.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 57b132b658e1..20b4b71a1a07 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1843,6 +1843,16 @@ static int console_trylock_spinning(void)
if (console_trylock())
return 1;

+ /*
+ * It's unsafe to spin once a panic has begun. If we are the
+ * panic CPU, we may have already halted the owner of the
+ * console_sem. If we are not the panic CPU, then we should
+ * avoid taking console_sem, so the panic CPU has a better
+ * chance of cleanly acquiring it later.
+ */
+ if (panic_in_progress())
+ return 0;
+
printk_safe_enter_irqsave(flags);

raw_spin_lock(&console_owner_lock);
--
2.30.2