Re: [RFC PATCH] panic: fix deadlock in panic()

From: chengjian (D)
Date: Fri Jun 05 2020 - 06:43:09 EST


Hi, Petr

On 2020/6/4 16:29, Petr Mladek wrote:

> It might cause double unlock (deadlock) on architectures that did not
> use NMI to stop the CPUs.

> I have created a conservative fix for this problem for SLES, see
> https://github.com/openSUSE/kernel-source/blob/SLE15-SP2-UPDATE/patches.suse/printk-panic-Avoid-deadlock-in-printk-after-stopping-CPUs-by-NMI.patch
> It solves the problem only on x86 architecture.
>
> There are many hacks that try to solve various scenarios but it
> is getting too complicated and does not solve all problems.

I have read your conservative fix and I have some questions:

1. Does console_sem need to be reinitialized?

2. Is there no such problem on other architectures that do not use NMI
   to stop the CPUs?
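
To make question 1 concrete, here is a minimal userspace sketch of the
situation I have in mind. It is not kernel code: struct sem_model,
sem_model_init(), sem_model_trylock(), sem_model_unlock() and
panic_cpu_gets_lock() are hypothetical names that only model a counting
semaphore whose holder is stopped before it can release the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of a counting semaphore; NOT the kernel's struct semaphore. */
struct sem_model {
	unsigned int count;	/* how many more acquisitions are allowed */
};

/* Put the semaphore into a known state, discarding any previous holder. */
void sem_model_init(struct sem_model *sem, unsigned int count)
{
	sem->count = count;
}

/* Non-blocking acquire: true on success, false if no count is left. */
bool sem_model_trylock(struct sem_model *sem)
{
	if (sem->count == 0)
		return false;
	sem->count--;
	return true;
}

/* Release the semaphore. */
void sem_model_unlock(struct sem_model *sem)
{
	sem->count++;
}

/*
 * Scenario: one CPU takes the lock and is then stopped, so it never
 * unlocks. Returns whether the panicking CPU can take the lock afterwards.
 */
bool panic_cpu_gets_lock(bool reinit_after_stop)
{
	struct sem_model console_sem_model;
	bool held;

	sem_model_init(&console_sem_model, 1);

	/* The soon-to-be-stopped CPU acquires the lock. */
	held = sem_model_trylock(&console_sem_model);
	assert(held);
	/* The CPU is stopped here; sem_model_unlock() is never called. */

	if (reinit_after_stop)
		sem_model_init(&console_sem_model, 1);

	return sem_model_trylock(&console_sem_model);
}
```

In this model, panic_cpu_gets_lock(false) fails because the stopped
holder never releases the lock, while panic_cpu_gets_lock(true) succeeds.
Whether forcing the state like this is needed, or safe, for console_sem
in the kernel is exactly what I am asking.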

> The only real solution is lockless printk(). First piece is a lockless
> ringbuffer. See the last version at
> https://lore.kernel.org/r/20200501094010.17694-1-john.ogness@xxxxxxxxxxxxx
>
> We prefer to work on the lockless solution instead of adding more
> complicated workarounds. This is why I even did not try to upstream
> the patch for SLES.
>
> In the meantime, you might also consider removing the offending
> message from the panic notifier if it is not really important.
>
> Best Regards,
> Petr


Thank you.

    Cheng Jian