[BUG REPORT] perf tools: x86_64: Broken calllchain when sampling taken at 'callq' instruction

From: Wangnan (F)
Date: Wed Nov 18 2015 - 02:28:10 EST


Hi all,

When analysising Jiri's patchset [1] I found a dwarf unwind problem.
On x86 platform, when sample is at a 'callq' instruction, dwarf based
stack unwind always fail.

I compile a small C source file with debug information, turn off
frame pointer and disable optimization:

$ gcc -g -O0 -fomit-frame-pointer ./test_dwarf_unwind.c -o ./test_dwarf_unwind

Here is part of asm code. Please have a look at 40057d. Also, have a
look at 400534 and 4005bc, from these two instructions we know this
"funcc" need 0x38 bytes on stack, so we know where is the return address.

$ objdump -S ./test_dwarf_unwind
...
0000000000400534 <funcc>:
#include <sys/time.h>

static volatile int x = 0;

int funcc(void)
{
400534: 48 83 ec 38 sub $0x38,%rsp
struct timeval tv1, tv2;
unsigned long us1, us2;

gettimeofday(&tv1, NULL);
400538: 48 8d 44 24 10 lea 0x10(%rsp),%rax
...
400572: 48 89 e0 mov %rsp,%rax
400575: be 00 00 00 00 mov $0x0,%esi
40057a: 48 89 c7 mov %rax,%rdi
40057d: e8 7e fe ff ff callq 400400 <gettimeofday@plt> <---!!!
us2 = tv2.tv_sec * 1000000 + tv2.tv_usec;
400582: 48 8b 04 24 mov (%rsp),%rax
400586: 48 69 d0 40 42 0f 00 imul $0xf4240,%rax,%rdx
...
return x;
4005b6: 8b 05 74 04 20 00 mov 0x200474(%rip),%eax # 600a30 <x>
}
4005bc: 48 83 c4 38 add $0x38,%rsp
4005c0: c3 retq

Then use perf to trace it:

# ./pref record -g -F 9 --call-graph dwarf ./test_dwarf_unwind
# ./perf report --no-children --stdio
...
1.23% test_dwarf_unwi test_dwarf_unwind [.] funcc
|
---funcc
|
|--0.62%-- funcb
| funca
| main
| __libc_start_main
| _start
|
--0.62%-- 0 <-- I applied Jiri's patch in [1] so see this 0,
which indicates a broken call chain.

From 'perf script' result I found the two samples inside funcc:

# ./perf script
...
test_dwarf_unwi 25232 2201855.424651: 443365670 cycles:pp:
4005a4 funcc (/home/w00229757/test_dwarf_unwind)
4005ca funcb (/home/w00229757/test_dwarf_unwind)
4005d8 funca (/home/w00229757/test_dwarf_unwind)
4005e6 main (/home/w00229757/test_dwarf_unwind)
7facc5c2ebd5 __libc_start_main (/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)
400459 _start (/home/w00229757/test_dwarf_unwind)
...
test_dwarf_unwi 25232 2201856.647088: 443268987 cycles:pp:
40057d funcc (/home/w00229757/test_dwarf_unwind)
0 [unknown] ([unknown])

With perf report -D I can manually check their stack information manually.


This is stack of sample at 4005a4:

# ./perf report -D
...
0x1d9c8 [0x20e8]: event: 9
.
. ... raw event: size 8424 bytes
. 0000: 09 00 00 00 02 40 e8 20 a4 05 40 00 00 00 00 00 .....@. ..@.....
...
. 00d0: 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 ......... ......
. 00e0: 2b f6 4c 56 00 00 00 00 46 52 02 00 00 00 00 00 +.LV....FR......
. 00f0: 29 f6 4c 56 00 00 00 00 70 e4 0c 00 00 00 00 00 ).LV....p.......
. 0100: 06 f3 3b d7 d7 24 05 00 b0 00 28 d7 d7 24 05 00 ..;..$....(..$..
. 0110: 00 00 00 00 00 00 00 00 ca 05 40 00 00 00 00 00
...
. 20e0: b0 06 00 00 00 00 00 00 ........
.
2201855424651081 0x1d9c8 [0x20e8]: PERF_RECORD_SAMPLE(IP, 0x4002): 25232/25232: 0x4005a4 period: 443365670 addr: 0
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
.......
.... IP 0x4005a4
.......
... ustack: size 1712, offset 0xd8
....

From the above information we can check its stack from 0xe0 (stack size
reside at 0xd8 as a u64 so stack data start at 0xe0). Based on information
we get from the above asm code, we know the return address should at the
0x38 of the stack, which is 0x4005ca.

This is stack of sample at 40057d:

0x343c0 [0x20e8]: event: 9
.
. ... raw event: size 8424 bytes
. 0000: 09 00 00 00 02 40 e8 20 7d 05 40 00 00 00 00 00 .....@. }.@.....
...
. 00d0: 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 ......... ......
. 00e0: 82 05 40 00 00 00 00 00 2c f6 4c 56 00 00 00 00 ..@.....,.LV....
. 00f0: 84 b7 05 00 00 00 00 00 29 f6 4c 56 00 00 00 00 ........).LV....
. 0100: 70 e4 0c 00 00 00 00 00 84 9a 4e d7 d7 24 05 00 p.........N..$..
. 0110: b0 00 28 d7 d7 24 05 00 00 00 00 00 00 00 00 00 ..(..$..........
. 0120: ca 05 40 00 00 00 00 00 f0 05 40 00 00 00 00 00 ..@.......@.....
...
. 20e0: b8 06 00 00 00 00 00 00 ........
.
2201856647088808 0x343c0 [0x20e8]: PERF_RECORD_SAMPLE(IP, 0x4002): 25232/25232: 0x40057d period: 443268987 addr: 0
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
......
.... IP 0x40057d
......
... ustack: size 1720, offset 0xd8
.....

By manually checking user stack from 0xe0, we can see problem that: although the IP
is still at 'callq', that instruction seems already take effect that, it has
already pushes its return address (0x400582) onto stack.

So there must be somewhere the pt_regs or the stack is adjusted in kernel.
Does anyone can give some hint on it?

Thank you.

[1] http://lkml.kernel.org/r/1447772739-18471-1-git-send-email-jolsa@xxxxxxxxxx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/