Re: [RFC PATCH] nmi,printk: fix ABBA deadlock between nmi_backtrace and dump_stack_lvl

From: Petr Mladek
Date: Wed Jul 24 2024 - 08:46:07 EST


On Wed 2024-07-17 09:22:21, John Ogness wrote:
> On 2024-07-15, Rik van Riel <riel@xxxxxxxxxxx> wrote:
> > Both nmi_backtrace and dump_stack_lvl call printk_cpu_sync_get_irqsave.
> >
> > However, dump_stack_lvl will call into the printk code, while holding
> > the printk_cpu_sync_get lock, and then take the console lock.
> >
> > Another CPU may end up getting an NMI stack trace printed, after
> > being stuck printing something to serial console for too long,
> > with the console lock held.
> >
> > This results in the following lock order:
> > CPU A: printk_cpu_sync_get lock -> console_lock
> > CPU B: console_lock -> (nmi) printk_cpu_sync_get lock
> >
> > This will cause the system to hang with an ABBA deadlock
>
> The console lock is acquired via trylock, so that will not yield
> deadlock here. However, if CPU B was printing, then CPU A will spin on
> @console_waiter (in console_trylock_spinning()). _That_ is a deadlock.
>
> The purpose of printk_cpu_sync_get_irqsave() is to avoid having the
> different backtraces being interleaved in the _ringbuffer_. It really
> isn't necessary that they are printed in that context. And indeed, there
> is no guarantee that they will be printed in that context anyway.

Well, backtraces are printed when unexpected things happen. It is
usually an emergency situation and we should do our best to
flush consoles ASAP.

> Perhaps a simple solution would be for printk_cpu_sync_get_irqsave() to
> call printk_deferred_enter/_exit. Something like the below patch.
>
> John Ogness
>
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 65c5184470f1..1a6f5aac28bf 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -315,8 +315,10 @@ extern void __printk_cpu_sync_put(void);
> #define printk_cpu_sync_get_irqsave(flags) \
> for (;;) { \
> local_irq_save(flags); \
> + printk_deferred_enter(); \
> if (__printk_cpu_sync_try_get()) \
> break; \
> + printk_deferred_exit(); \
> local_irq_restore(flags); \
> __printk_cpu_sync_wait(); \
> }
> @@ -329,6 +331,7 @@ extern void __printk_cpu_sync_put(void);
> #define printk_cpu_sync_put_irqrestore(flags) \
> do { \
> __printk_cpu_sync_put(); \
> + printk_deferred_exit(); \
> local_irq_restore(flags); \
> } while (0)
>

OK, there is the basic rule: "Never take a lock in NMI when the lock might
be taken in another context". printk_cpu_sync_get() violates the rule.
But it is safe as long as the lock is re-entrant and tail.

The above patch fixes a situation where printk_cpu_sync_get() was not
a tail lock. So it looks like a reasonable fix if we want to keep it simple.

Well, I still prefer the alternative solution at
https://lore.kernel.org/r/93155b2ccafa43ed4845ae450451c6b8e27509cc.camel@xxxxxxxxxxx
which does not deffer the console output in normal/IRQ context. There
are doubts whether it is safe. I think that it is. Let me to reply there.

Best Regards,
Petr