Re: 2.6.23 regression: accessing invalid mmap'ed memory from gdb causes unkillable spinning

From: Nick Piggin
Date: Wed Oct 31 2007 - 03:41:20 EST


On Tue, Oct 30, 2007 at 11:56:00PM -0700, David Miller wrote:
> From: Nick Piggin <npiggin@xxxxxxx>
> Date: Wed, 31 Oct 2007 07:42:21 +0100
>
> > Sysrq+T fails to show the stack trace of a running task. Presumably this
> > is to avoid a garbled stack, however it can often be useful, and besides
> > there is no guarantee that the task won't start running in the middle of
> > show_stack(). If there are any correctness issues, then the archietcture
> > would have to take further steps to ensure the task is not running.
> >
> > Signed-off-by: Nick Piggin <npiggin@xxxxxxx>
>
> This is useful.
>
> Even more useful would be a show_regs() on the cpu where running tasks
> are running. If not a full show_regs() at least a program counter.

Yes, that's something I've needed as well. And I doubt that we're
very unusual among kernel developers in this regard ;)


> That's usually what you're trying to debug and we provide nearly no
> way to handle: some task is stuck in a loop in kernel mode and you
> need to know exactly where that is.
>
> This is pretty easy to do on sparc64. In fact I can capture remote
> cpu registers even when that CPU's interrupts are disabled. I suppose
> other arches could do a NMI'ish register capture like this as well.
>
> I have a few bug reports that I can't make more progress on because I
> currently can't ask users to do something to fetch the registers on
> the seemingly hung processor. This is why I'm harping on this so
> much :-)
>
> Anyways, my core suggestion is to add a hook here so platforms can
> do the remote register fetch if they want.

You could possibly even do a generic "best effort" kind of thing with
regular IPIs, that will timeout and continue if some CPUs don't handle
them, and should be pretty easy to get working with existing smp_call_
function stuff. Not exactly clean, but it would be better than nothing.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/