Re: [RFC PATCH] nmi,printk: fix ABBA deadlock between nmi_backtrace and dump_stack_lvl

From: Rik van Riel
Date: Wed Jul 17 2024 - 09:48:31 EST


On Wed, 2024-07-17 at 09:22 +0206, John Ogness wrote:
>
> The purpose of printk_cpu_sync_get_irqsave() is to avoid having the
> different backtraces being interleaved in the _ringbuffer_. It really
> isn't necessary that they are printed in that context. And indeed,
> there is no guarantee that they will be printed in that context anyway.
>
> Perhaps a simple solution would be for printk_cpu_sync_get_irqsave()
> to call printk_deferred_enter/_exit. Something like the below patch.
>

I think that would do the trick. The nmi_backtrace() printk is already
deferred, because of the check for in_nmi() in vprintk(), and this
change would put all the other users of printk_cpu_sync_get_irqsave()
on the exact same footing as nmi_backtrace().

Combing through the code a little, it looks like that would remove
the potential for this deadlock to happen again.
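
To make the reasoning concrete, here is a rough userspace model of the
idea, not kernel code. Every model_* name is hypothetical and exists
only for illustration; a counter stands in for the console/port lock,
and the later flush that the kernel would trigger on its own is an
explicit call here. The point it tries to show: while the cpu sync
"lock" is held, messages only land in the buffer and the console lock
is never taken, so the two locks are never nested in that order.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_RING_SLOTS 8
#define MODEL_MSG_LEN    64

static char model_ring[MODEL_RING_SLOTS][MODEL_MSG_LEN];
static int model_ring_count;         /* records stored so far */
static int model_flushed;            /* records already written out */
static int model_deferred_depth;     /* > 0: store only, skip the console */
static int model_console_lock_takes; /* times the "console lock" was taken */

/* Write out pending records; stands in for printing under the console lock. */
static void model_flush_console(void)
{
	model_console_lock_takes++;
	while (model_flushed < model_ring_count)
		fputs(model_ring[model_flushed++], stdout);
}

static void model_deferred_enter(void)
{
	model_deferred_depth++;
}

static void model_deferred_exit(void)
{
	assert(model_deferred_depth > 0);
	model_deferred_depth--;
}

/* Always store the record; only touch the console when not deferred. */
static void model_printk(const char *msg)
{
	if (model_ring_count < MODEL_RING_SLOTS) {
		snprintf(model_ring[model_ring_count], MODEL_MSG_LEN, "%s", msg);
		model_ring_count++;
	}
	if (model_deferred_depth == 0)
		model_flush_console();
}

/* Models the proposed change: taking the sync lock also enters deferred mode. */
static void model_cpu_sync_get(void)
{
	model_deferred_enter();
}

static void model_cpu_sync_put(void)
{
	model_deferred_exit();
}

/* A backtrace-style caller that prints several lines under the sync lock. */
static void model_dump_stack(void)
{
	model_cpu_sync_get();
	model_printk("backtrace line 1\n");
	model_printk("backtrace line 2\n");
	model_cpu_sync_put();
}
```

Calling model_dump_stack() stores both lines without touching the
console counter; a later model_flush_console() from outside the locked
region writes them out.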
>
>
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 65c5184470f1..1a6f5aac28bf 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -315,8 +315,10 @@ extern void __printk_cpu_sync_put(void);
>  #define printk_cpu_sync_get_irqsave(flags) \
>  	for (;;) { \
>  		local_irq_save(flags); \
> +		printk_deferred_enter(); \
>  		if (__printk_cpu_sync_try_get()) \
>  			break; \
> +		printk_deferred_exit(); \
>  		local_irq_restore(flags); \
>  		__printk_cpu_sync_wait(); \
>  	}
> @@ -329,6 +331,7 @@ extern void __printk_cpu_sync_put(void);
>  #define printk_cpu_sync_put_irqrestore(flags) \
>  	do { \
>  		__printk_cpu_sync_put(); \
> +		printk_deferred_exit(); \
>  		local_irq_restore(flags); \
>  	} while (0)
>  
>

--
All Rights Reversed.