Re: 2.6.23 regression: accessing invalid mmap'ed memory from gdb causes unkillable spinning

From: David Miller
Date: Wed Oct 31 2007 - 02:56:17 EST


From: Nick Piggin <npiggin@xxxxxxx>
Date: Wed, 31 Oct 2007 07:42:21 +0100

> Sysrq+T fails to show the stack trace of a running task. Presumably this
> is to avoid a garbled stack, however it can often be useful, and besides
> there is no guarantee that the task won't start running in the middle of
> show_stack(). If there are any correctness issues, then the architecture
> would have to take further steps to ensure the task is not running.
>
> Signed-off-by: Nick Piggin <npiggin@xxxxxxx>

This is useful.

Even more useful would be a show_regs() for each running task, taken on
the cpu it is running on. If not a full show_regs(), at least a program
counter.

That's usually the case you're trying to debug, and we provide nearly no
way to handle it: some task is stuck in a loop in kernel mode and you
need to know exactly where that is.

This is pretty easy to do on sparc64. In fact I can capture remote
cpu registers even when that CPU's interrupts are disabled. I suppose
other arches could do an NMI'ish register capture like this as well.

I have a few bug reports that I can't make more progress on because I
currently can't ask users to do something to fetch the registers on
the seemingly hung processor. This is why I'm harping on this so
much :-)

Anyways, my core suggestion is to add a hook here so platforms can
do the remote register fetch if they want.
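
To make the hook idea concrete, here is a minimal userspace sketch, not
kernel code and not taken from any patch in this thread. Every name in it
(struct task, struct cpu_regs, set_remote_regs_hook(), dump_task(),
fake_remote_regs()) is hypothetical. It only shows the shape: the generic
dump path calls an optional platform callback for tasks that are currently
running, and falls back to the plain output when no callback is registered
or the fetch fails.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct task {
	const char *comm;	/* task name */
	int cpu;		/* cpu the task is on */
	int running;		/* nonzero if executing right now */
};

struct cpu_regs {
	unsigned long pc;	/* program counter */
	unsigned long sp;	/* stack pointer */
};

/* Platform hook: fill *regs from the given cpu, return 0 on success. */
typedef int (*remote_regs_fn)(int cpu, struct cpu_regs *regs);

/* NULL means the platform cannot fetch remote registers. */
static remote_regs_fn remote_regs_hook;

void set_remote_regs_hook(remote_regs_fn fn)
{
	remote_regs_hook = fn;
}

/*
 * Generic dump path. Returns 1 if remote registers were printed,
 * 0 if only the basic task line was printed.
 */
int dump_task(const struct task *t)
{
	struct cpu_regs regs;

	printf("%s: cpu %d%s\n", t->comm, t->cpu,
	       t->running ? " (running)" : "");

	if (t->running && remote_regs_hook &&
	    remote_regs_hook(t->cpu, &regs) == 0) {
		printf("  pc=%#lx sp=%#lx\n", regs.pc, regs.sp);
		return 1;
	}
	return 0;
}

/* Fake platform implementation returning made-up register values. */
int fake_remote_regs(int cpu, struct cpu_regs *regs)
{
	regs->pc = 0x1000UL + (unsigned long)cpu;
	regs->sp = 0x2000UL;
	return 0;
}
```

A platform that can do the capture would register its callback once at
boot, roughly set_remote_regs_hook(fake_remote_regs) in this sketch;
everyone else leaves the hook NULL and keeps today's behaviour.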