Re: [PATCH] arm: Fix backtrace generation when IPI is masked

From: Daniel Thompson
Date: Tue Sep 15 2015 - 04:11:16 EST


On 15/09/15 07:58, Hillf Danton wrote:
> > Currently on ARM when <SysRq-L> is triggered from an interrupt handler
> > (e.g. a SysRq issued using UART or kbd) the main CPU will wedge for ten
> > seconds with interrupts masked before issuing a backtrace for every CPU
> > except itself.
> >
> > The new backtrace code introduced by commit 96f0e00378d4 ("ARM: add
> > basic support for on-demand backtrace of other CPUs") does not work
> > correctly when run from an interrupt handler because IPI_CPU_BACKTRACE
> > is used to generate the backtrace on all CPUs but cannot preempt the
> > current calling context.
> >
> > This can be fixed by detecting that the calling context cannot be
> > preempted and issuing the backtrace directly in this case. Some small
> > changes to the generic code are required to support this.
> >
> > Signed-off-by: Daniel Thompson <daniel.thompson at linaro.org>
> > ---
> >  arch/arm/kernel/smp.c | 7 +++++++
> >  lib/nmi_backtrace.c   | 5 ++++-
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> > index 48185a773852..4d8a80328c74 100644
> > --- a/arch/arm/kernel/smp.c
> > +++ b/arch/arm/kernel/smp.c
> > @@ -748,6 +748,13 @@ core_initcall(register_cpufreq_notifier);
> >
> >  static void raise_nmi(cpumask_t *mask)
> >  {
> > +	/*
> > +	 * Generate the backtrace directly if we are running in a
> > +	 * calling context that is not preemptible by the backtrace IPI.
> > +	 */
> > +	if (cpumask_test_cpu(smp_processor_id(), mask) && irqs_disabled())
> > +		nmi_cpu_backtrace(NULL);
> > +
> >  	smp_cross_call(mask, IPI_CPU_BACKTRACE);
> >  }
> >
> > diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
> > index 88d3d32e5923..be0466a80d0b 100644
> > --- a/lib/nmi_backtrace.c
> > +++ b/lib/nmi_backtrace.c
> > @@ -149,7 +149,10 @@ bool nmi_cpu_backtrace(struct pt_regs *regs)
> >  		/* Replace printk to write into the NMI seq */
> >  		this_cpu_write(printk_func, nmi_vprintk);
> >  		pr_warn("NMI backtrace for cpu %d\n", cpu);
> > -		show_regs(regs);
> > +		if (regs)
> > +			show_regs(regs);
> > +		else
> > +			dump_stack();
>
> Better if dump_stack() is added in a separate patch, given that
> it is not mentioned in commit message.

Adding dump_stack() is mentioned in passing ("Some small changes to the generic code are required to support this.") but you're right that the reason for the change is not explicitly called out.

I can certainly respin as two patches, but perhaps it's better just to improve the commit message. Something like:

> This can be fixed by detecting that the calling context cannot be
> preempted and issuing the backtrace directly in this case. Issuing
> directly leaves us without any pt_regs to pass to nmi_cpu_backtrace().
> Modify the generic code to call dump_stack() when its argument is
> NULL.

Which do you prefer?
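
For anyone following along, the intended control flow can be modelled in user space roughly as below. This is only a sketch and not kernel code: model_regs, pending_mask, last_trace, model_cpu_backtrace() and model_raise_nmi() are hypothetical stand-ins invented for illustration, CPU masks are plain bitmasks, and the "IPI delivery" is just a loop. It assumes a pending mask that is cleared once a CPU has been traced, so the CPU that traced itself directly is not traced a second time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_NR_CPUS 4u

/* Hypothetical stand-in for struct pt_regs. */
struct model_regs { unsigned long pc; };

enum trace_kind { TRACE_NONE, TRACE_SHOW_REGS, TRACE_DUMP_STACK };

static unsigned int pending_mask;                 /* CPUs still owing a backtrace */
static enum trace_kind last_trace[MODEL_NR_CPUS]; /* how each CPU was traced */

/* Models the patched nmi_cpu_backtrace(): NULL regs selects dump_stack(). */
static bool model_cpu_backtrace(unsigned int cpu, const struct model_regs *regs)
{
	assert(cpu < MODEL_NR_CPUS);
	if (!(pending_mask & (1u << cpu)))
		return false;	/* nothing requested for this CPU */

	last_trace[cpu] = regs ? TRACE_SHOW_REGS : TRACE_DUMP_STACK;
	printf("backtrace for cpu %u via %s\n", cpu,
	       regs ? "show_regs()" : "dump_stack()");
	pending_mask &= ~(1u << cpu);
	return true;
}

/* Models the patched raise_nmi() as seen from CPU `self`. */
static void model_raise_nmi(unsigned int self, unsigned int mask, bool irqs_off)
{
	struct model_regs regs = { .pc = 0 };	/* placeholder register state */
	unsigned int cpu;

	assert(self < MODEL_NR_CPUS);
	pending_mask = mask;

	/* The IPI cannot preempt us, so trace ourselves directly without regs. */
	if ((mask & (1u << self)) && irqs_off)
		model_cpu_backtrace(self, NULL);

	/* "Deliver" the IPI; a CPU with IRQs masked cannot take it. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu == self && irqs_off)
			continue;
		model_cpu_backtrace(cpu, &regs);
	}
}
```

In this model, calling model_raise_nmi(0, 0x3, true) traces CPU 0 through the dump_stack() path and CPU 1 through the show_regs() path, which is the behaviour the reworded commit message is trying to describe.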


Daniel.