Re: [patch, 2.6.17-rc3-mm1] i386: break out of recursion in stackframe walk

From: Keith Owens
Date: Wed May 03 2006 - 01:08:41 EST


Ingo Molnar (on Tue, 2 May 2006 11:50:34 +0200) wrote:
>if CONFIG_FRAME_POINTERS is enabled, and one does a dump_stack() during
>early SMP init, an infinite stackdump and a bootup hang happens:
>
> [<c0104e7f>] show_trace+0xd/0xf
> [<c0104e96>] dump_stack+0x15/0x17
> [<c01440df>] save_trace+0xc3/0xce
> [<c014527d>] mark_lock+0x8c/0x4fe
> [<c0145df5>] __lockdep_acquire+0x44e/0xaa5
> [<c0146798>] lockdep_acquire+0x68/0x84
> [<c1048699>] _spin_lock+0x21/0x2f
> [<c010d918>] prepare_set+0xd/0x5d
> [<c010daa8>] generic_set_all+0x1d/0x201
> [<c010ca9a>] mtrr_ap_init+0x23/0x3b
> [<c010ada8>] identify_cpu+0x2a7/0x2af
> [<c01192a7>] smp_store_cpu_info+0x2f/0xb4
> [<c01197d0>] start_secondary+0xb5/0x3ec
> [<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
> [<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
> [<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
> [<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
> [<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
> [<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
> [<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
> [<c104ec11>] end_of_stack_stop_unwind_function+0x1/0x4
> [...]
>
>due to "end_of_stack_stop_unwind_function" recursing back to itself in
>the EBP stackframe-walker. So avoid this type of recursion when walking
>the stack .
>
>Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>
>Index: linux/arch/i386/kernel/traps.c
>===================================================================
>--- linux.orig/arch/i386/kernel/traps.c
>+++ linux/arch/i386/kernel/traps.c
>@@ -150,6 +150,12 @@ static inline unsigned long print_contex
> while (valid_stack_ptr(tinfo, (void *)ebp)) {
> addr = *(unsigned long *)(ebp + 4);
> printed = print_addr_and_symbol(addr, log_lvl, printed);
>+ /*
>+ * break out of recursive entries (such as
>+ * end_of_stack_stop_unwind_function):
>+ */
>+ if (ebp == *(unsigned long *)ebp)
>+ break;
> ebp = *(unsigned long *)ebp;
> }
> #else

KDB just limits kernel traces to a maximum of 200 entries, which
catches direct as well as indirect recursion. IA64 is notorious for
getting loops in its unwind data, sometime looping over three or four
functions. Checking for a maximum number of entries is a simple and
architecture independent check.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/