Re: [tip:core/debug] debug lockups: Improve lockup detection

From: Andrew Morton
Date: Sun Aug 02 2009 - 14:46:41 EST


On Sun, 2 Aug 2009 13:09:34 GMT tip-bot for Ingo Molnar <mingo@xxxxxxx> wrote:

> Commit-ID: c1dc0b9c0c8979ce4d411caadff5c0d79dee58bc
> Gitweb: http://git.kernel.org/tip/c1dc0b9c0c8979ce4d411caadff5c0d79dee58bc
> Author: Ingo Molnar <mingo@xxxxxxx>
> AuthorDate: Sun, 2 Aug 2009 11:28:21 +0200
> Committer: Ingo Molnar <mingo@xxxxxxx>
> CommitDate: Sun, 2 Aug 2009 13:27:17 +0200
>
> --- a/drivers/char/sysrq.c
> +++ b/drivers/char/sysrq.c
> @@ -24,6 +24,7 @@
> #include <linux/sysrq.h>
> #include <linux/kbd_kern.h>
> #include <linux/proc_fs.h>
> +#include <linux/nmi.h>
> #include <linux/quotaops.h>
> #include <linux/perf_counter.h>
> #include <linux/kernel.h>
> @@ -222,12 +223,7 @@ static DECLARE_WORK(sysrq_showallcpus, sysrq_showregs_othercpus);
>
> static void sysrq_handle_showallcpus(int key, struct tty_struct *tty)
> {
> -	struct pt_regs *regs = get_irq_regs();
> -	if (regs) {
> -		printk(KERN_INFO "CPU%d:\n", smp_processor_id());
> -		show_regs(regs);
> -	}
> -	schedule_work(&sysrq_showallcpus);
> +	trigger_all_cpu_backtrace();
> }

I think this just broke all non-x86, non-sparc SMP architectures.
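
One way to keep the showallcpus handler useful elsewhere would be to fall back to the old path when the arch can't do the backtrace. Below is a rough userspace sketch of that idea, not kernel code. The names try_all_cpu_backtrace(), arch_has_nmi_backtrace and showregs_via_workqueue() are made up for illustration, and the notion that the hook can report failure is an assumption: in the patch above trigger_all_cpu_backtrace() returns nothing.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counters so the sketch can be checked; not part of any kernel API. */
static int nmi_dumps;
static int fallback_dumps;

/* Hypothetical switch: does this "architecture" implement the backtrace? */
static bool arch_has_nmi_backtrace;

/*
 * Hypothetical stand-in for trigger_all_cpu_backtrace().
 * Assumption: it can tell the caller whether it did anything.
 */
static bool try_all_cpu_backtrace(void)
{
	if (!arch_has_nmi_backtrace)
		return false;
	nmi_dumps++;
	puts("backtrace requested on all CPUs");
	return true;
}

/* Stand-in for the removed show_regs() + schedule_work() path. */
static void showregs_via_workqueue(void)
{
	fallback_dumps++;
	puts("this CPU: regs; other CPUs follow from the workqueue");
}

/* Use the new path where it exists, the old one everywhere else. */
static void sysrq_handle_showallcpus_sketch(void)
{
	if (!try_all_cpu_backtrace())
		showregs_via_workqueue();
}
```

With arch_has_nmi_backtrace left false the handler takes the old path, so an architecture without the hook would still print something.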

> static struct sysrq_key_op sysrq_showallcpus_op = {
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 7717b95..9c5fa9f 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -35,6 +35,7 @@
> #include <linux/rcupdate.h>
> #include <linux/interrupt.h>
> #include <linux/sched.h>
> +#include <linux/nmi.h>
> #include <asm/atomic.h>
> #include <linux/bitops.h>
> #include <linux/module.h>
> @@ -469,6 +470,8 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
> 	}
> 	printk(" (detected by %d, t=%ld jiffies)\n",
> 	       smp_processor_id(), (long)(jiffies - rsp->gp_start));
> +	trigger_all_cpu_backtrace();

Be aware that trigger_all_cpu_backtrace() is a PITA when you have a lot
of CPUs.

If a callsite is careful to emit the most important information last,
so that it is still visible after all the backtraces have scrolled past,
then that might improve things.

OTOH, log buffer overflow will truncate, I think. So that info needs
to be emitted first too ;)

It's a PITA.
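
To make the first-and-last point concrete, here is a toy userspace model, not kernel code. It assumes, as guessed above, that a full log buffer drops later lines, and that a console only shows the newest few lines. LOG_CAP, SCREEN, emit() and report_stall() are all hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define LOG_CAP 4	/* hypothetical: the log holds only 4 lines */
#define SCREEN  2	/* hypothetical: the console shows the last 2 lines */
#define LINE_SZ 64

static char log_buf[LOG_CAP][LINE_SZ];
static int  log_len;			/* lines actually stored in the log */
static char screen[SCREEN][LINE_SZ];
static int  emitted;			/* lines emitted in total */

/* The log drops lines once full; the screen keeps only the newest ones. */
static void emit(const char *line)
{
	if (log_len < LOG_CAP)
		snprintf(log_buf[log_len++], LINE_SZ, "%s", line);
	snprintf(screen[emitted % SCREEN], LINE_SZ, "%s", line);
	emitted++;
}

static int in_log(const char *line)
{
	for (int i = 0; i < log_len; i++)
		if (strcmp(log_buf[i], line) == 0)
			return 1;
	return 0;
}

static int on_screen(const char *line)
{
	for (int i = 0; i < SCREEN; i++)
		if (strcmp(screen[i], line) == 0)
			return 1;
	return 0;
}

/* Emit the summary before and after the per-CPU dump. */
static void report_stall(int ncpus)
{
	char line[LINE_SZ];

	emit("stall detected by CPU 0");
	for (int cpu = 0; cpu < ncpus; cpu++) {
		snprintf(line, sizeof(line), "backtrace CPU %d", cpu);
		emit(line);
	}
	emit("stall detected by CPU 0");
}
```

With 8 CPUs the summary survives in both places, while most of the backtraces are lost from the log and have scrolled off the screen.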

