Re: [PATCH] kretprobe: produce sane stack traces

From: Aleksa Sarai
Date: Thu Nov 01 2018 - 16:26:06 EST


On 2018-11-02, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> On Thu, 1 Nov 2018 21:49:48 +1100
> Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
>
> > On 2018-11-01, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> > > > > > Anyway, until that merge happens, this patch looks good to avoid
> > > > > > this issue for generic solution (e.g. for the arch which doesn't
> > > > > > supports retstack).
> > > > >
> > > > > I think its time to come up with an algorithm that makes function graph
> > > > > work with multiple users, and have kretprobes be able to hook into it
> > > > > just like kprobes hooks into function tracer.
> > > > >
> > > > > I have some ideas on how to get this done, and will try to have an RFC
> > > > > patch set ready by plumbers.
> > > >
> > > > Should I continue working on this patchset?
> > >
> > > Yes, until we finally introduce Steven's algorithm on all arch (yeah, we still
> > > have some archs which don't support graph-tracer but supports kprobes),
> > > I think your patch is the best fix for this issue.
> >
> > Thanks, I just sent a v3.
> >
> > Though, even with Steven's hooking of kprobes I think you'd still need
> > to stash away the stack trace somewhere (or am I misunderstanding the
> > proposal?).
>
> Wait, I might miss something. kretprobe and func-graph tracer just swap the
> return address on the stack but no change on other stuffs on the stack.
> In that case, if we can restore the stack, isn't it enough?

If stack traces works within the func-graph tracer context, then that
would be enough.

However, this is not currently the case in regular kretprobes -- the
issue isn't that kretprobe_trampoline is *on* the stack trace but rather
that the stack trace is *only* kretprobe_trampoline.

I spent several days (and got nowhere) trying to understand why the
stack unwinder cannot go past kretprobe_trampoline. You can see this if
you try to use bpftrace (which just runs bpf_get_stackid ->
perf_callchain_kernel):

% bpftrace -e 'kretprobe:do_filp_open { @x = stack; }'
Attaching 1 probe...
^C

@x:
kretprobe_trampoline+1

With my patch, you see the full stack trace you expect. It should be
noted the same problem exists when adding a kretprobe with ftrace's
kprobe_events and options/stacktrace.

Weirdly, after adding show_stack(NULL, NULL) from within a kretprobe
context you get an answer beyond kretprobe_trampoline (though it is
quite ugly and contains lots of information you don't see in
get_perf_callchain or save_stack_trace -- for some reason). I will take
a closer look at this.

But it seems to me that some aspect of the stack trace handling with
kretprobe_trampoline is broken. I wouldn't mind working on fixing this,
but it might require lots of architecture-specific work (and it should
be noted that ultimately we want to have an identical stack trace
between kprobes and kretprobe so that you can DTrace-style aggregate
them).

> And anyway, even if using func-graph tracer, stack unwinding works correctly.
> I thought it means we don't need to backup whole the stack, doesn't it?

Currently, for kretprobes, the entire stack needs to be backed up
because the unwinder stops when it hits kretprobe_trampoline.

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

Attachment: signature.asc
Description: PGP signature