Re: Getting empty callchain from perf_callchain_kernel()

From: Josh Poimboeuf
Date: Wed Jun 12 2019 - 10:54:51 EST


On Wed, Jun 12, 2019 at 10:54:23AM +0200, Peter Zijlstra wrote:
> On Tue, Jun 11, 2019 at 10:05:01PM -0500, Josh Poimboeuf wrote:
> > On Fri, May 24, 2019 at 10:53:19AM +0200, Peter Zijlstra wrote:
> > > > For ORC, I'm thinking we may be able to just require that all generated
> > > > code (BPF and others) always use frame pointers. Then when ORC doesn't
> > > > recognize a code address, it could try using the frame pointer as a
> > > > fallback.
> > >
> > > Yes, this seems like a sensible approach. We'd also have to audit the
> > > ftrace and kprobe trampolines, IIRC they only do framepointer setup for
> > > CONFIG_FRAME_POINTER currently, which should be easy to fix (after the
> > > patches I have to fix the FP generation in the first place:
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=x86/wip
> >
> > Right now, ftrace has a special hook in the ORC unwinder
> > (orc_ftrace_find). It would be great if we could get rid of that in
> > favor of the "always use frame pointers" approach. I'll hold off on
> > doing the kpatch/kprobe trampoline conversions in my patches since it
> > would conflict with yours.
> >
> > Though, hm, because of pt_regs I guess ORC would need to be able to
> > decode an encoded frame pointer? I was hoping we could leave those
> > encoded frame pointers behind in CONFIG_FRAME_POINTER-land forever...
>
> Ah, I see.. could a similar approach work for the kprobe trampolines
> perhaps?

If you mean requiring that kprobe trampolines always use frame
pointers, I think it should work.
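
For readers following along, the "fall back to the frame pointer when
ORC doesn't recognize the address" idea can be sketched as a toy
userspace model. This is not kernel code: every name below (toy_state,
toy_orc_has_entry, toy_fp_step, toy_unwind_next) and the address range
are made up for illustration, and the real ORC path is not modeled at
all. The only assumption is the usual frame layout, where the slot at
the frame pointer holds the caller's frame pointer and the next slot
holds the return address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of an unwinder state, not the kernel's. */
struct toy_state {
	const uintptr_t *stack;	/* stack words, index 0 = lowest address */
	size_t nwords;		/* number of valid words in stack[] */
	size_t bp;		/* index of the current frame's saved-bp slot */
	uintptr_t ip;		/* current code address */
};

/* Stand-in for the ORC lookup: pretend only this range has ORC data. */
static int toy_orc_has_entry(uintptr_t ip)
{
	return ip >= 0x1000 && ip < 0x2000;
}

/*
 * One frame-pointer step: stack[bp] is the caller's bp and
 * stack[bp + 1] is the return address.  Returns 1 if a return
 * address was produced, 0 if the frame pointer is unusable.
 */
static int toy_fp_step(struct toy_state *s)
{
	size_t next_bp;

	if (s->bp >= s->nwords || s->nwords - s->bp < 2)
		return 0;

	next_bp = (size_t)s->stack[s->bp];
	s->ip = s->stack[s->bp + 1];

	/* Callers live at higher addresses; anything else ends the walk. */
	s->bp = (next_bp > s->bp) ? next_bp : s->nwords;
	return 1;
}

/*
 * Returns -1 if ORC data exists for the address (that path is not
 * modeled here), otherwise the result of the frame-pointer fallback.
 */
static int toy_unwind_next(struct toy_state *s)
{
	if (toy_orc_has_entry(s->ip))
		return -1;
	return toy_fp_step(s);
}
```

The fallback only works if the unrecognized code actually maintains
the frame pointer, which is why generated code and trampolines would
all have to set one up.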

> > Here are my latest BPF unwinder patches in case anybody wants a sneak
> > peek:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=bpf-orc-fix
>
> On a quick read-through, that looks good to me. A minor nit:
>
> /* mov dst_reg, %r11 */
> EMIT_mov(dst_reg, AUX_REG);
>
> The disparity between %r11 and AUX_REG is jarring. I understand the
> whole bpf register mapping thing, but it is just weird when reading
> this.

True, but there are several cases where r11 is hard-coded in the
instruction encoding itself, like:

/* mov imm32, %r11 */
EMIT3_off32(0x49, 0xC7, 0xC3, imm32);

If the code were more decoupled, like if it had helpers where you could
always just pass AUX_REG, and the code never had to know what the value
of AUX_REG is, then using "AUX_REG" in the comments would make sense.

But since there are inconsistencies, with hard-coded register mapping
knowledge in many places, I find it easier to follow what's going on
when the specific register name is always shown in the comments.
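
To make the "hard-coded in the encoding" point concrete, here is a
small standalone sketch that decodes the destination register from the
three bytes above, following the standard x86-64 REX/ModRM rules. The
helper name mov_imm32_dst_reg is made up for illustration and is not
part of the JIT.

```c
#include <stdint.h>

/*
 * Decode the destination register number of a register-direct
 * "REX C7 /0" instruction (mov imm32, r/m64).
 * Returns the register number (0-15), or -1 if the bytes are not
 * of that form.
 */
static int mov_imm32_dst_reg(uint8_t rex, uint8_t opcode, uint8_t modrm)
{
	if ((rex & 0xF0) != 0x40)	/* not a REX prefix */
		return -1;
	if (opcode != 0xC7)		/* not the C7 opcode */
		return -1;
	if ((modrm >> 6) != 3)		/* mod != 11: memory operand */
		return -1;
	if (((modrm >> 3) & 7) != 0)	/* opcode extension must be /0 */
		return -1;

	/* REX.B (bit 0) supplies the high bit of the r/m register. */
	return (modrm & 7) | ((rex & 1) << 3);
}
```

Calling mov_imm32_dst_reg(0x49, 0xC7, 0xC3) returns 11: REX.B is set
in 0x49 and the r/m field of 0xC3 is 3, so the register is r11 no
matter what AUX_REG maps to.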

> Other than that, the same note as before, the 32bit JIT still seems
> buggered, but I'm not sure you (or anybody else) cares enough about that
> to fix it though. It seems to use ebp as its own frame pointer, which
> completely defeats an unwinder.

I'm still trying to decide if I care about 32-bit. It does indeed use
ebp everywhere. But I'm not sure if I want to poke the beehive... Also
factoring into the equation is the fact that I'll be on PTO next week
:-) If I have time in the next couple days then I may take a look.

--
Josh