Re: [PATCH/RFC] perf report: Add option to change offset format when address is specified as a sort_key
From: Arnaldo Carvalho de Melo
Date: Tue Dec 26 2017 - 09:52:01 EST
Em Fri, Dec 15, 2017 at 03:35:10PM +0000, Aaron Tomlin escreveu:
> With --call-graph option, a user can choose to display call chains with a
> specific sort_key. The sort_key can be either: function, address or srcline.
> By default, when the address is specified as the sort_key, the offset (i.e.
> symbol + offset) is provided in decimal (base 10) form.
> This is because in __get_srcline() we use PRIu64 to display the offset.
> This may or may not be desirable.
>
> This patch adds a new option --hex, to change the offset format to
> hexidecimal. Note, this can only be used with the -g, --call-graph option
> when address is specified as a sort_key. When this option is not
> specified, the default behavior is used, which is to display the offset
> in decimal form.
>
> Before:
>
> $ ./perf report -n --fields=overhead,sample,period,pid,symbol --stdio \
> > -g graph,callee,address --percent-limit 10 --no-children
> [ ... ]
> 10.87% 1 506940477 15:migration/3 [k] _spin_trylock
> |
> ---_spin_trylock +13
> double_lock_balance +48
> schedule +2007
> migration_thread +613
> kthread +158
> child_rip +10
Humm, trying to reproduce this with the command line you provided above,
in your 'Before:' case:
# perf record -F 10000 -g --pid 10,15,21,27
# perf report -n --fields=overhead,sample,period,pid,symbol --stdio -g graph,callee,address --no-children
7.31% 1 205216 10:migration/0 [k] _raw_spin_lock
|
---_raw_spin_lock atomic.h:187
scheduler_tick sched.h:935
update_process_times timer.c:1637
tick_sched_handle tick-sched.c:163
tick_sched_timer tick-sched.c:1171
__hrtimer_run_queues hrtimer.c:1211
hrtimer_interrupt hrtimer.c:469
smp_apic_timer_interrupt apic.c:1025
apic_timer_interrupt entry_64.S:795
finish_task_switch current.h:15
__sched_text_start core.c:2802
schedule current.h:15
smpboot_thread_fn smpboot.c:160
kthread kthread.c:238
ret_from_fork entry_64.S:447
[root@jouet ~]# perf --version
perf version 4.15.rc4.g6c7919a
This is with is in my perf/core branch, but I bet you'll get the same
result with 4.15-rc4.
Humm, so this is because you're not using a vmlinux file, then the
srcline gets resolved and get_srcline() returns before getting to the
fallback where offsets are shown, if I do:
[root@jouet ~]# mv /lib/modules/4.15.0-rc3+/build/vmlinux /lib/modules/4.15.0-rc3+/build/vmlinux.OFF
and remove the cached entry in my ~/.debug/ build-id cache, I get to:
# symbol: _raw_spin_lock
#
# Total Lost Samples: 0
#
# Samples: 54 of event 'cycles:ppp'
# Event count (approx.): 2805651
#
# Overhead Samples Period Pid:Command Source:Line
# ........ ............ ............ ................... ...................................
#
7.31% 1 205216 10:migration/0 _raw_spin_lock+18446603336221184012
|
---_raw_spin_lock +18446603336221184012
scheduler_tick +18446603336221184060
update_process_times +18446603336221184064
tick_sched_handle +18446603336221184034
tick_sched_timer +18446603336221184052
__hrtimer_run_queues +18446603336221184214
hrtimer_interrupt +18446603336221184166
smp_apic_timer_interrupt +18446603336221184086
__irqentry_text_start +18446603336221184150
finish_task_switch +18446603336221184123
__sched_text_start +18446603336221184577
schedule +18446603336221184040
smpboot_thread_fn +18446603336221184323
kthread +18446603336221184273
ret_from_fork +18446603336221184031
#
[root@jouet ~]#
Which looks buggy... bisecting...
- Arnaldo