Re: [External] Re: [PATCH v2] printk: fix deadlock when kernel panic

From: Muchun Song
Date: Tue Feb 09 2021 - 07:21:59 EST


On Tue, Feb 9, 2021 at 5:19 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> On Sat 2021-02-06 13:41:24, Muchun Song wrote:
> > We found a deadlock bug on our server when the kernel panic. It can be
> > described in the following diagram.
> >
> > CPU0: CPU1:
> > panic rcu_dump_cpu_stacks
> > kdump_nmi_shootdown_cpus nmi_trigger_cpumask_backtrace
> > register_nmi_handler(crash_nmi_callback) printk_safe_flush
> > __printk_safe_flush
> > raw_spin_lock_irqsave(&read_lock)
> > // send NMI to other processors
> > apic_send_IPI_allbutself(NMI_VECTOR)
> > // NMI interrupt, dead loop
> > crash_nmi_callback
> > printk_safe_flush_on_panic
> > printk_safe_flush
> > __printk_safe_flush
> > // deadlock
> > raw_spin_lock_irqsave(&read_lock)
> >
> > The register_nmi_handler() can be called in the __crash_kexec() or the
> > crash_smp_send_stop() on the x86-64. Because CPU1 is interrupted by the
> > NMI with holding the read_lock and crash_nmi_callback() never returns,
> > CPU0 can deadlock when printk_safe_flush_on_panic() is called.
> >
> > When we hold the read_lock and then interrupted by the NMI, if the NMI
> > handler call nmi_panic(), it is also can lead to deadlock.
> >
> > In order to fix it, we make read_lock global and rename it to
> > safe_read_lock. And we handle safe_read_lock the same way in
> > printk_safe_flush_on_panic() as we handle logbuf_lock there.
>
> What about the following commit message? It uses imperative language
> and explains that the patch just prevents the deadlock. It removes
> some details. The diagram is better than many words.
>
> <commit message>
> printk_safe_flush_on_panic() caused the following deadlock on our server:
>
> CPU0: CPU1:
> panic rcu_dump_cpu_stacks
> kdump_nmi_shootdown_cpus nmi_trigger_cpumask_backtrace
> register_nmi_handler(crash_nmi_callback) printk_safe_flush
> __printk_safe_flush
> raw_spin_lock_irqsave(&read_lock)
> // send NMI to other processors
> apic_send_IPI_allbutself(NMI_VECTOR)
> // NMI interrupt, dead loop
> crash_nmi_callback
> printk_safe_flush_on_panic
> printk_safe_flush
> __printk_safe_flush
> // deadlock
> raw_spin_lock_irqsave(&read_lock)
>
> DEADLOCK: read_lock is taken on CPU1 and will never get released.
>
> It happens when panic() stops a CPU by NMI while it has been in
> the middle of printk_safe_flush().
>
> Handle the lock the same way as logbuf_lock. The printk_safe buffers
> are flushed only when both locks can be safely taken.
>
> Note: It would actually be safe to re-init the locks when all CPUs were
> stopped by NMI. But it would require passing this information
> from arch-specific code. It is not worth the complexity.
> Especially because logbuf_lock and printk_safe buffers have been
> obsoleted by the lockless ring buffer.
> </commit message>

Many thanks. It is clear.

>
> > Fixes: cf9b1106c81c ("printk/nmi: flush NMI messages on the system panic")
> > Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
>
> With an updated commit message:
>
> Reviewed-by: Petr Mladek <pmladek@xxxxxxxx>

Thanks. I will update the commit message.

>
> Best Regards,
> Petr