Re: [PATCH] printk/nmi: Prevent deadlock when serializing NMI backtraces

From: Sergey Senozhatsky
Date: Wed Jun 06 2018 - 06:34:08 EST


On (06/06/18 14:10), Sergey Senozhatsky wrote:
> On (06/05/18 14:47), Petr Mladek wrote:
> [..]
> > Grr, the ABBA deadlock is still there. NMIs are not sent to the other
> > CPUs atomically. Even if we detect that logbuf_lock is available
> > in printk_nmi_enter() on some CPUs, it might still get locked on
> > another CPU before the other CPU gets NMI.
>
> Can we do something about "B"? :) I mean - any chance we can rework
> locking in nmi_cpu_backtrace()?

Sorry, I don't have much free time at the moment, so I can't
fully concentrate on this issue.

Here is a quick-n-dirty thought.

The whole idea of printk_nmi() was to reduce the number of locks
we take when performing printk() from NMI context - ideally down to zero.
We added the logbuf spin_lock later on; one lock was fine. But then
we added another one - the nmi_cpu_backtrace() lock - and two locks
is too many. So can we drop the nmi_cpu_backtrace() lock and, in
exchange, extend the printk-safe API with functions that disable/enable
PRINTK_NMI_DEFERRED_CONTEXT_MASK on a particular CPU?

I refer to these as HARD and SOFT printk_nmi :) Just for fun. The hard
one has no right to touch the logbuf and uses only the per-CPU buffer,
while the soft one can use either the per-CPU buffer or the logbuf.
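
Something along these lines, perhaps (a completely untested sketch; the
helper names are made up here and I'm just reusing the existing per-CPU
printk_context flags from kernel/printk/printk_safe.c - vprintk_func()
already prefers the per-CPU buffer while the NMI bit is set; declarations
in linux/printk.h omitted):

	/* kernel/printk/printk_safe.c - hypothetical helpers */
	void printk_nmi_hard(void)
	{
		/* from now on this CPU logs to its per-CPU buffer only */
		this_cpu_and(printk_context, ~PRINTK_NMI_DEFERRED_CONTEXT_MASK);
		this_cpu_or(printk_context, PRINTK_NMI_CONTEXT_MASK);
	}

	void printk_nmi_soft(void)
	{
		/* the logbuf may be used again, via the deferred path */
		this_cpu_and(printk_context, ~PRINTK_NMI_CONTEXT_MASK);
		this_cpu_or(printk_context, PRINTK_NMI_DEFERRED_CONTEXT_MASK);
	}

nmi_cpu_backtrace() would then just wrap its printing in
printk_nmi_hard()/printk_nmi_soft(), as in the diff below.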

---

diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 0ace3c907290..b57d5daa90b5 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -90,11 +90,10 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,

 bool nmi_cpu_backtrace(struct pt_regs *regs)
 {
-	static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
 	int cpu = smp_processor_id();
 
 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
-		arch_spin_lock(&lock);
+		printk_nmi_hard();
 		if (regs && cpu_in_idle(instruction_pointer(regs))) {
 			pr_warn("NMI backtrace for cpu %d skipped: idling at %pS\n",
 				cpu, (void *)instruction_pointer(regs));
@@ -105,7 +104,7 @@ bool nmi_cpu_backtrace(struct pt_regs *regs)
 		else
 			dump_stack();
 		}
-		arch_spin_unlock(&lock);
+		printk_nmi_soft();
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
 		return true;
 	}