Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers

From: Linus Torvalds
Date: Sun Apr 27 2014 - 16:08:54 EST


On Sun, Apr 27, 2014 at 5:08 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> So it's useful information for hairy bugs and it would be sad to
> remove them.

I tend to agree. I've often found the left-overs to be good clues
about what just got called. Although equally often it's another kind
of clue entirely: that the stack frame of some of the functions
involved in the real call chain is much too big, leaving room for that
stale information to lie around.

> Having said that, your complaint that '?' entries can make reading of
> back traces more difficult is valid as well - so maybe we can do
> something about that.

Quite frankly, I'd much rather just remove the annoying hex numbers
that are imnsho *much* more distracting. Possibly even the "/0xsize"
part (although that is at least somewhat useful to judge where in the
function it is).

And while it would be horrible for readability, it might also be a
good idea to replace the newlines with something like " -> " instead,
because we are quite often vertically challenged. But that could
really make things pretty unreadable.

So to take your example, it might be something like this

arch_trigger_all_cpu_backtrace+0x3c -> do_raw_spin_lock+0xb7
-> _raw_spin_lock_irqsave+0x35 -> ? prepare_to_wait+0x18
-> prepare_to_wait+0x18 -> ? generic_make_request+0x80
-> ? unmap_underlying_metadata+0x2e -> __wait_on_bit+0x20
-> ? submit_bio+0xd2 -> out_of_line_wait_on_bit+0x54
-> ? unmap_underlying_metadata+0x2e -> ? autoremove_wake_function+0x31
-> __wait_on_buffer+0x1b -> __ext3_get_inode_loc+0x1ef -> ext3_iget+0x45
-> ext3_lookup+0x97 -> lookup_real+0x20 -> __lookup_hash+0x2a
-> lookup_slow+0x36 -> path_lookupat+0xf9 -> filename_lookup+0x1f
-> user_path_at_empty+0x3f -> user_path_at+0xd -> vfs_fstatat+0x40
-> ? lg_local_unlock+0x31 -> vfs_stat+0x13 -> sys_stat64+0x11
-> ? __fput+0x187 -> ? restore_all+0xf -> ? trace_hardirqs_on_thunk+0xc
-> syscall_call+0x7

which is admittedly complete line noise, but is just 13 lines rather
than 31. That can sometimes be a really big deal.
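
As a rough illustration of the compact format above (a sketch, not code
from this thread or from the kernel), the transformation could be done in
post-processing along these lines. The input shape
" [<address>] ? symbol+0xoff/0xsize" is an assumption about what the oops
lines look like, the addresses in the usage note are made up, and the
function name `compact_backtrace` and the 72-column width are hypothetical.

```python
import re

# Assumed input shape (not taken from the mail), one entry per line:
#   " [<c1037a5d>] ? prepare_to_wait+0x18/0x57"
# Group 1 is the optional '?' marker, group 2 is "symbol+0xoffset".
ENTRY = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+\+0x[0-9a-f]+)(?:/0x[0-9a-f]+)?"
)


def compact_backtrace(lines, width=72):
    """Drop the hex address and "/0xsize", join entries with " -> "."""
    entries = []
    for line in lines:
        m = ENTRY.search(line)
        if m:
            marker = "? " if m.group(1) else ""
            entries.append(marker + m.group(2))

    out, cur = [], ""
    for i, entry in enumerate(entries):
        piece = entry if i == 0 else "-> " + entry
        if cur and len(cur) + 1 + len(piece) > width:
            # Line is full: continuation lines start with "-> ".
            out.append(cur)
            cur = piece
        else:
            cur = piece if not cur else cur + " " + piece
    if cur:
        out.append(cur)
    return out
```

Feeding it three lines such as
" [<c1022e2c>] arch_trigger_all_cpu_backtrace+0x3c/0x64",
" [<c11b3bd7>] do_raw_spin_lock+0xb7/0xe8" and
" [<c1037a5d>] ? prepare_to_wait+0x18/0x57" gives two output lines, with
the '?' entry wrapped onto the second one.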

Also, we might want to cap the number of lines regardless. It is true
that sometimes the really deep call chains can be interesting, but
equally often they make other important stuff scroll off the screen
(oopses that don't get caught in /var/log/messages because they kill
the machine are the worst to debug, and we still end up having people
send pictures taken with digital cameras of them), so it's a "win
some, lose some" kind of thing.
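
A minimal sketch of such a cap, again as post-processing and purely
illustrative: the name `cap_lines` is hypothetical, and keeping the first
lines (the innermost frames) while summarizing the rest is an assumption,
since nothing above says which end of the trace should survive.

```python
def cap_lines(lines, max_lines):
    """Keep at most max_lines lines, the last one noting what was cut."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    if len(lines) <= max_lines:
        return list(lines)
    # Reserve the final line for the truncation note.
    kept = list(lines[:max_lines - 1])
    dropped = len(lines) - len(kept)
    return kept + ["... (%d more lines)" % dropped]
```

The trade-off is the "win some, lose some" one: the deep end of a long
call chain is lost, but whatever was printed before the trace is less
likely to scroll away.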

Of course, the questionable stale entries on the stack can (and do)
make the whole "scroll off the screen" thing worse. So I dunno.

Linus