Re: [PATCH 10/19] x86/dumpstack: add get_stack_info() interface

From: Josh Poimboeuf
Date: Sat Jul 23 2016 - 09:09:42 EST


On Fri, Jul 22, 2016 at 04:52:10PM -0700, Andy Lutomirski wrote:
> On Fri, Jul 22, 2016 at 4:26 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> > On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >> valid_stack_ptr() is buggy: it assumes that all stacks are of size
> >> THREAD_SIZE, which is not true for exception stacks. So the
> >> walk_stack() callbacks will need to know the location of the beginning
> >> of the stack as well as the end.
> >>
> >> Another issue is that in general the various features of a stack (type,
> >> size, next stack pointer, description string) are scattered around in
> >> various places throughout the stack dump code.
> >
> > I finally figured out what visit_info is. But would it make more
> > sense to track it in the unwind state so that the unwinder can
> > directly make sure it doesn't start looping?
> >
>
> I just realized that it *is* in the unwind state. But maybe this code
> in update_stack_state:
>
> sp = info->next;
> if (!sp).
> goto unknown;
>
> if (get_stack_info(sp, state->task, info, &state->stack_mask))
> goto unknown;
>
> if (!on_stack(info, addr, len))
> goto unknown;
>
> should do something like:
>
> if (get_stack_info(addr, ...))
> goto unknown.
>
> sp = info->end;
>
> instead. Alternatively, maybe it would make sense to keep sp as is
> (have update_stack_state return bool instead of returning a pointer)
> so that a frame that switches stacks still shows the actual sp at the
> time that the frame called whatever the it called.
>
> I'm really quite confused by what state->sp means in your current
> code. In the non-stack-switching case (everything is on the thread
> stack), it appears to always match state->bp. Am I missing something?
> If I'm understanding this correctly, when you're pointing at a call
> frame, state->bp is that frame's base address (the top of the stack
> frame), unwind_get_return_address() returns the address to which that
> frame would return, and, in the future, unwind_get_gpr(UNWIND_DI) or
> whatever it ends up looking like will return RDI at the time that the
> frame called whatever function it called, if known. By that logic,
> shouldn't state->sp be sp on entry to the call instruction? (Or could
> sp just be removed? Does it do anything?)

Yeah, I think sp has no purpose and can actually just be removed.

(It was leftover from a previous iteration of the code where it did have
a purpose and I forgot to remove it.)

> I guess the reason I'm still not 100% comfortable with the idea that
> pt_regs frames don't exist a real frames is that I keep waffling as to
> how I should think about the regs associated with a frame in the
> future DWARF world. I think I imagine them being the regs at the time
> that the frame did it's call to the next frame, which, by an
> admittedly weak analogy, means that the pt_regs state would be the
> regs at the time that the exception or interrupt happened. That
> offers a third silly option for dealing with the annoying '?': emit
> two frames for a pt_regs, but have the frame in the entry code show
> NULL for its return address because it's not a normal return.

Well, I'd say let's not get ahead of ourselves. I think the current
regs-aren't-a-frame design works fine for now, and the code is fairly
simple. If/when we get a DWARF unwinder, we can revisit that decision.

--
Josh