Re: ORC unwinder and 'reliable' flag to printk_stack_address()

From: Josh Poimboeuf
Date: Fri Sep 08 2017 - 09:16:55 EST


On Fri, Sep 08, 2017 at 01:12:45PM +0200, Jiri Kosina wrote:
> On Wed, 6 Sep 2017, Josh Poimboeuf wrote:
>
> > > I just got the below stack trace with current Linus' tree with ORC
> > > unwinder enabled:
> > >
> > > [ 8.652765] Call Trace:
> > > [ 8.652767] dump_stack+0x7c/0xbf
> > > [ 8.652769] print_circular_bug+0x2d3/0x2e0
> > > [ 8.652771] check_prev_add+0x666/0x700
> > > [ 8.652772] ? print_bfs_bug+0x40/0x40
> > > [ 8.652775] lock_commit_crosslock+0x3f1/0x570
> > > [ 8.652777] complete+0x24/0x60
> > > [ 8.652779] __kthread_parkme+0x42/0x90
> > > [ 8.652780] smpboot_thread_fn+0x92/0x210
> > > [ 8.652782] kthread+0x145/0x180
> > > [ 8.652783] ? sort_range+0x20/0x20
> > > [ 8.652785] ? kthread_create_on_node+0x40/0x40
> > > [ 8.652787] ret_from_fork+0x2a/0x40
> > >
> > > Please note the kthread_create_on_node(), sort_range() and print_bfs_bug()
> > > entries ... I believe they actually shouldn't be there at all. All of them
> > > are at the last byte belonging to the function. Am I missing something?
> >
> > The question marks are still supposed to be there. They show any text
> > addresses found on the stack that weren't otherwise found by the
> > unwinder. 99.9% of the time, they're left over from a previous call
> > chain, and should be ignored.
>
> It's interesting though that all of them (and even in other instances I've
> encountered in the meantime) are always at the "last address" of the
> function address range. That's hardly a conincidence, and also hardly a
> result of a legitimate stacktrace -- 'call' is rather unlikely to be the
> last insn in a function.

Ah, I missed that detail. In this case, these aren't actually bread
crumbs from a previous stack. They're actually function pointers which
are saved on the current stack.

In my vmlinux, the function which comes after kthread_create_on_node()
is kthread(). Which means kthread_create_on_node+0x40/0x40 is the same
address as kthread+0x0. So it's the 'kthread' function pointer.
Similarly, sort_range+0x20/0x20 is the same as smpboot_thread_fn+0x0.

So the 'kthread' and 'smpboot_thread_fn' function pointers are getting
saved on the stack in kthread()'s frame. The oops dump code in
show_trace_log_lvl() sees those text addresses and prints them as bread
crumbs.

So your next question might be, why is it printed as

kthread_create_on_node+0x40/0x40

instead of

kthread+0x0/0x18e

?

The reason is, there are some "noreturn" functions which never return,
like panic(). They sometimes have a CALL instruction at the very end.
So if kthread_create_on_node()'s last instruction were a CALL, we'd want
to show *that* function name instead of kthread().

So when printing splats, we always assume that the boundary between the
end of a function and the beginning of a function belongs to the
previous function. That's what the printk '%pB' specifier does.

The only exception to that is when printing regs->ip. In that case, the
interrupt could have occurred right at the beginning of a function. So
we would print 'kthread+0x0'. Which is what the '%pS' specifier does.

--
Josh