Re: [PATCH v2 30/44] x86/unwind: add new unwind interface and implementations

From: Josh Poimboeuf
Date: Thu Aug 11 2016 - 15:15:29 EST


On Thu, Aug 11, 2016 at 11:58:13AM -0700, Andy Lutomirski wrote:
> On Aug 11, 2016 7:09 PM, "Josh Poimboeuf" <jpoimboe@xxxxxxxxxx> wrote:
> >
> > On Thu, Aug 11, 2016 at 07:58:34AM -0700, Andy Lutomirski wrote:
> > > On Thu, Aug 11, 2016 at 7:28 AM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > > > On Thu, Aug 11, 2016 at 12:18:54AM -0700, Andy Lutomirski wrote:
> > > >> > > I admit I don't fully understand the use case. If someone wants to
> > > >> > > start a trace in the middle, shouldn't they just pass regs in?
> > > >> >
> > > >> > The regs aren't always available. Some callers just want to skip the
> > > >> > first few frames so the stack dump code itself isn't traced.
> > > >>
> > > >> I suspect that all users are okay with your algorithm simply because
> > > >> they don't switch stacks. Maybe the thing to do is to stop advancing
> > > >> when sp is passed or if the stack switches at all.
> > > >
> > > > There are actually some cases which could be broken by the sloppy "while
> > > > state->bp < sp" check. When starting with sp from 'regs->sp', sp often
> > > > points to a different stack than the current one. It seemed to work in
> > > > my testing, but I guess I got lucky, with irq percpu stack addresses
> > > > being smaller than the thread stack addresses.
> > > >
> > > >> Could you point me at a user that passes anything other than regs->sp
> > > >> here? On brief inspection, I haven't found any at all.
> > > >
> > > > For example, see show_stack().
> > >
> > > Is that a non-trivial case? show_stack() is generating sp *and* bp.
> > > Why not just pass both of them all the way in to show_trace_log_lvl
> > > (which the existing code already does) and then pass it into
> > > unwind_start?
> > >
> > > Alternatively, since that risks causing a bit of loss if you implement
> > > DWARF, you could add an unwind_start_here() function that captures the
> > > state in the calling function and pass the result all the way through.
> > > Or you could write a silly asm helper that literally fills in a struct
> > > pt_regs for the current context (although that could get a bit awkward
> > > for 32-bit, since pt_regs doesn't really contain sp there).
> > >
> > > If there are genuinely zero non-trivial cases, then this should fully
> > > solve the problem, no?
> >
> > Hm, what do you mean by non-trivial cases?
> >
> > As far as I can tell, they're all trivial, and we don't need the caller
> > to provide bp. Just start with the current frame's bp and unwind until
> > bp and sp are on the same stack and bp > sp.
>
> Given that your unwind-till-we-get-there algorithm isn't quite
> correct

But it *will* be correct in v3, with a minor change. Sneak preview:

/*
* The caller can optionally provide a stack pointer directly
* (first_sp) or indirectly (regs->sp), which indicates which stack
* frame to start unwinding at. Skip ahead until we reach that frame.
*/
while (!unwind_done(state) &&
(!on_stack(&state->stack_info, first_sp, sizeof(*first_sp) ||
state->bp < first_sp)))
unwind_next_frame(state);

> wouldn't it be easier to just pass all the needed information
> through? Especially since that information is there (in at least one
> case) on current kernels, so the diff would be smaller.

Why focus on diff size? Isn't it the resulting code that's most
important? It's not like removing the bp argument causes a big diff
anyway.

> It seems overcomplicated to me to regenerate the state that was
> already in the registers by unwinding to it rather than just passing
> it in to the unwinder directly.

Passing in the frame pointer when it isn't needed makes the interface
more complex. It creates more opportunity for the caller to mess things
up and more edge cases to consider in the unwinder.

And as you mentioned, it's not compatible with DWARF. Adding
unwind_start_here() would add even more interface complexity.

--
Josh