Re: [BUG?] perf: dwarf unwind doesn't work correctly on aarch64
From: Kim Phillips
Date: Thu Mar 23 2017 - 23:24:57 EST
On Thu, 23 Feb 2017 16:50:18 +0900
Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
[sorry for the delay, I just saw this]
> perf record -g dwarf (and perf report) doesn't show correct callchain
> on aarch64. Here is how to reproduce it.
...
> # Samples: 6K of event 'cpu-clock:u'
> # Event count (approx.): 1623750000
> #
> # Children Self Command Shared Object Symbol
> # ........ ........ ....... ............. ..........................
> #
> 17.21% 17.21% main main [.] func2
> |
> ---func2
>
> 17.09% 17.09% main main [.] func1
> |
> ---func1
>
> 16.67% 16.67% main main [.] main
> |
> ---main
> .....
>
> So, as you can see, the call graph reported each function has been
> called from itself. If I report it with fp as below, perf reported
> correct callgraph.
...
> I guess there is a bug in libunwind on aarch64 or we missed to pass
> the stack data to libunwind. (BTW, it works correctly on arm32)
Trying to replicate this on a Debian 9 ("stretch") arm64 box:
Building acme's 'perf/urgent' branch (currently with the tag
perf-urgent-for-mingo-4.11-20170317), natively (cd tools; make clean;
make DEBUG=5 -C perf) shows this system has unwind support:
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ OFF ]
... libpython: [ on ]
... libslang: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ OFF ]
... bpf: [ on ]
for which 'apt search unwind' returns the version:
libunwind-dev/testing,now 1.1-4.1 arm64 [installed]
library to determine the call-chain of a program - development
libunwind8/testing,now 1.1-4.1 arm64 [installed,automatic]
library to determine the call-chain of a program - runtime
Continuing, and ignoring the missing debug_frame support that the perf
build configuration warns about:
Makefile.config:421: No debug_frame support found in libunwind-aarch64
Makefile.config:480: No debug_frame support found in libunwind
$ ./perf --version
perf version 4.10.rc4.ge7ede72
$ gcc --version
gcc (Debian 6.3.0-6) 6.3.0 20170205
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gcc -O0 -ggdb3 -funwind-tables -o main main.c
$ ./perf record -g --call-graph dwarf,1024 -e cpu-clock:u -o /tmp/perf.data -- ./main
^C[ perf record: Woken up 121 times to write data ]
[ perf record: Captured and wrote 30.154 MB /tmp/perf.data (22975 samples) ]
$ ./perf --no-pager report -i /tmp/perf.data --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 22K of event 'cpu-clock:u'
# Event count (approx.): 5743750000
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............. .....................
#
100.00% 8.14% main main [.] main
|
|--91.86%--main
| func0
| |
| --76.41%--func1
| |
| --60.82%--func2
| |
| --45.31%--func3
| |
| --30.17%--func4
| |
| --15.04%--func
|
--8.14%--__libc_start_main
main
...
which looks like it should, i.e., I can't reproduce.
You mentioned you're using the 'latest' sources for libunwind, etc.,
but can you provide more exact details like commit IDs, and what, if
anything, is being cross-built vs. native?
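Something along these lines would collect what I'm asking for. It is only
a rough sketch: describe_tree, LIBUNWIND_SRC and LINUX_SRC are made-up
names, and the default paths are placeholders for wherever your trees are
actually checked out.

```shell
# Hypothetical helper: print "<label>: <commit>" for a git checkout,
# or say so if the directory is not one.
describe_tree() {
    label=$1
    dir=$2
    if commit=$(git -C "$dir" rev-parse --short=12 HEAD 2>/dev/null); then
        printf '%s: %s\n' "$label" "$commit"
    else
        printf '%s: not a git checkout (%s)\n' "$label" "$dir"
    fi
}

# Assumed locations; override via the environment.
describe_tree libunwind "${LIBUNWIND_SRC:-$HOME/src/libunwind}"
describe_tree linux "${LINUX_SRC:-$HOME/src/linux}"

# Architecture of the machine doing the build, to compare with the target.
printf 'build host: %s\n' "$(uname -m)"
```

If the build host line doesn't say aarch64, then something is being
cross-built, and knowing which pieces would help.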
Thanks,
Kim