Re: [PATCH 05/19] x86/dumpstack: fix function graph tracing stack dump reliability issues

From: Steven Rostedt
Date: Tue Aug 02 2016 - 19:16:38 EST


On Tue, 2 Aug 2016 17:13:59 -0500
Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:


> > Then we only need the fp use case when FRAME_POINTER is not set. As
> > mcount forces FRAME_POINTER, we only need to worry about the fentry
> > case.
>
> Hm, I'm confused. First, I don't see where mcount forces FRAME_POINTER.

Hmm, we should probably force it generally, as gcc itself requires
mcount to be used with frame pointers. -pg (which emits the mcount
calls) can't be used without them.

>
> Second, I don't see why that even matters. If mcount and frame pointers
> are enabled, then the 'fp' field of ftrace_ret_stack is needed for the
> gcc sanity check, right? So we couldn't override 'fp', and the old
> "stateful index" version of ftrace_graph_ret_addr() would have to be
> used in the code above for reliable addresses, and we'd still have the
> same out-of-sync bug.
>
> Or am I missing something?
>

Or I missed something. How did we get out of sync? If we have frame
pointers, shouldn't the "return_to_handler" be seen as reliable by the
code (not that we save it as such)? That is, if the frame pointer shows
that the next function is return_to_handler, then we increment the
index into ret_stack, otherwise we simply record the return_to_handler
as a normal "unreliable" function, without any processing of it.

I guess I don't actually understand how the NMI screwed it up, as
function graph doesn't trace "do_nmi()" itself nor anything before that.
I'm guessing it really got out of sync because there's a
"return_to_handler" on the stack that wasn't really called (it is not
part of the frame pointer chain). ftrace_graph_ret_addr() currently
advances the index regardless of whether the return_to_handler it found
is part of a stack frame or just left over on the stack. THAT is why I
think it got out of sync.
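
To make the two behaviours concrete, here is a minimal user-space sketch.
It is not the kernel code: struct stack_word, resolve_walk(), the
in_frame flag and the RETURN_TO_HANDLER value are all hypothetical names
made up for illustration, and ret_stack is simply ordered most recent
entry first. With gate_on_frame false, any word equal to
return_to_handler consumes a ret_stack entry, which models the behaviour
described above. With gate_on_frame true, only words that the frame
pointer chain identifies as return addresses consume an entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the address of the return_to_handler trampoline. */
#define RETURN_TO_HANDLER 0xdeadUL

/* One word read from the stack during the dump (illustrative only). */
struct stack_word {
	unsigned long value;	/* value found on the stack */
	bool in_frame;		/* true if the frame pointer chain marks it as a return address */
};

/*
 * Walk the stack words and replace return_to_handler with the saved
 * return address from ret_stack (most recent entry first).
 *
 * gate_on_frame == false: every matching word consumes an entry, even a
 *                         stale leftover that no frame points at.
 * gate_on_frame == true:  only matches inside a real frame consume an
 *                         entry; leftovers are reported unchanged.
 *
 * Returns the number of ret_stack entries consumed.
 */
size_t resolve_walk(const struct stack_word *words, size_t n,
		    const unsigned long *ret_stack, size_t depth,
		    bool gate_on_frame, unsigned long *out)
{
	size_t idx = 0;

	assert(words && ret_stack && out);

	for (size_t i = 0; i < n; i++) {
		out[i] = words[i].value;

		if (words[i].value != RETURN_TO_HANDLER)
			continue;
		if (gate_on_frame && !words[i].in_frame)
			continue;	/* leftover: leave it as an unreliable entry */
		if (idx < depth)
			out[i] = ret_stack[idx++];
	}
	return idx;
}
```

Feeding this a stale return_to_handler followed by a real one, with a
single ret_stack entry, shows the drift: without the gate, the stale
word takes the only entry and the real frame stays unresolved. With the
gate, the stale word is passed through untouched and the real frame gets
the saved address.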

-- Steve