On Wed, Nov 18, 2015 at 12:13:08PM +0800, Wangnan (F) wrote:
I think callchain value being 0 is an error or marker for the end of
On 2015/11/17 23:05, Jiri Olsa wrote:
From: Jiri Olsa <jolsa@xxxxxxxxxx>
[SNIP]
As reported by Milian, DWARF unwinding (both libdw and libunwind)
currently displays the callchain in callee order only.
Add support for following the configured callchain order to the libunwind
DWARF unwinder, so that report produces the following output:
$ perf record --call-graph dwarf ls
...
$ perf report --no-children --stdio
39.26% ls libc-2.21.so [.] __strcoll_l
|
---__strcoll_l
mpsort_with_tmp
mpsort_with_tmp
sort_files
main
__libc_start_main
_start
0
$ perf report -g caller --no-children --stdio
...
39.26% ls libc-2.21.so [.] __strcoll_l
|
---0
_start
__libc_start_main
main
sort_files
mpsort_with_tmp
mpsort_with_tmp
__strcoll_l
Reported-by: Milian Wolff <milian.wolff@xxxxxxxx>
Based-on-patch-by: Milian Wolff <milian.wolff@xxxxxxxx>
Link: http://lkml.kernel.org/n/tip-lmtbeqm403f3luw4jkjevsi5@xxxxxxxxxxxxxx
Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
---
tools/perf/util/unwind-libunwind.c | 47 ++++++++++++++++++++++++--------------
1 file changed, 30 insertions(+), 17 deletions(-)
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 0ae8844fe7a6..705e1c19f1ea 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
- unw_get_reg(&c, UNW_REG_IP, &ip);
- ret = ip ? entry(ip, ui->thread, cb, arg) : 0;

In the original code, entry() is not called if ip == 0.

+ if (callchain_param.order == ORDER_CALLER)
+ j = max_stack - i - 1;
+ ret = entry(ips[j], ui->thread, cb, arg);

But in the new code an entry is built even if ips[j] == 0, which causes
a user-noticeable behavior change:
Before this patch:
# perf report --no-children --stdio --call-graph=callee
...
3.38% a.out a.out [.] funcc
|
---funcc
|
--2.70%-- funcb
funca
main
__libc_start_main
_start
After this patch:
# perf report --no-children --stdio --call-graph=callee
...
3.38% a.out a.out [.] funcc
|
---funcc
|
|--2.70%-- funcb
| funca
| main
| __libc_start_main
| _start
|
--0.68%-- 0
I'm not sure whether we can regard this behavior change as a bugfix. I
think there may be some reason the original code explicitly avoids
creating a '0' entry.
I think a callchain value of 0 is an error or a marker for the end of the
callchain, so it would be better to avoid the 0 entry.
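
To make the point about the 0 entry concrete, here is a minimal standalone
sketch, not the actual perf code. It assumes the unwinder has already
collected the addresses callee-first into an array, and it walks them in
either order while skipping zero entries, the way the pre-patch
"ip ? entry(...) : 0" expression did. The names emit_chain, entry_cb,
ip_log and log_ip are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for perf's callchain order setting. */
enum chain_order { ORDER_CALLEE, ORDER_CALLER };

/* Hypothetical per-entry callback; a nonzero return stops the walk. */
typedef int (*entry_cb)(uint64_t ip, void *arg);

/*
 * Walk ips[0..n-1], stored callee-first as the unwinder collected them,
 * in the requested order. Zero entries are skipped so that no "0" node
 * is created, which mirrors the pre-patch behaviour.
 */
static int emit_chain(const uint64_t *ips, size_t n, enum chain_order order,
		      entry_cb cb, void *arg)
{
	int ret = 0;

	assert(cb != NULL);
	for (size_t i = 0; i < n && !ret; i++) {
		/* Caller order reads the array from the far end. */
		size_t j = (order == ORDER_CALLER) ? n - i - 1 : i;

		if (ips[j] == 0)
			continue;	/* do not build an entry for ip == 0 */
		ret = cb(ips[j], arg);
	}
	return ret;
}

/* Example callback: append each ip to a small fixed-size buffer. */
struct ip_log {
	uint64_t ips[16];
	size_t nr;
};

static int log_ip(uint64_t ip, void *arg)
{
	struct ip_log *out = arg;

	if (out->nr >= 16)
		return -1;	/* buffer full, stop the walk */
	out->ips[out->nr++] = ip;
	return 0;
}
```

With a chain such as { funcc, funcb, 0 }, both orders then report only the
two real frames, so the extra "--0.68%-- 0" branch shown above would not
appear.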
But unfortunately, we have many 0 entries (and broken callchain after
them) with fp recording on optimized binaries. I think we should omit
those callchains.
Maybe something like this?
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 5ef90be2a249..22642c5719ab 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1850,6 +1850,15 @@ static int thread__resolve_callchain_sample(struct thread *thread,
#endif
ip = chain->ips[j];
+ /* callchain value inside zero page means it's broken, stop */
+ if (ip < 4096) {
+ if (callchain_param.order == ORDER_CALLER) {
+ callchain_cursor_reset(&callchain_cursor);
+ continue;
+ } else
+ break;
+ }
+
err = add_callchain_ip(thread, parent, root_al, &cpumode, ip);
if (err)