v4: Remove the options "--inline-line" and "--inline-name". Just use
a new option "--inline" to print the inline function information.
The policy is if the inline function name can be resolved then
print the name in priority. If the name can't be resolved, then
print the source line number.
For example:
perf report --stdio --inline
0.69% 0.00% inline ld-2.23.so [.] dl_main
|
---dl_main
|
--0.56%--_dl_relocate_object
|
---_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)
Following 3 patches are updated according to this change.
perf report: Show inline stack in browser mode
perf report: Show inline stack in stdio mode
perf report: Create new inline option
Followings are not changed.
perf report: Find the inline stack for a given address
perf report: Refactor common code in srcline.c
v3: Iterate on RIPs of all callchain entries to check if the RIP is in
inline functions.
Reverse the order of the inliner printout if necessary.
Provide new options "--inline-line" / "--inline-name" to print
inline function name or print inline function source line.
v2: Thanks so much for Arnaldo's comments!
The modifications are:
1. Divide v1 patch "perf report: Find the inline stack for a
given address" into 2 patches:
a. perf report: Refactor common code in srcline.c
b. perf report: Find the inline stack for a given address
Some function names are changed:
dso_name_get -> dso__name
ilist_apend -> inline_list__append
get_inline_node -> dso__parse_addr_inlines
free_inline_node -> inline_node__delete
2. Since the function name are changed, update following patches
accordingly.
a. perf report: Show inline stack in stdio mode
b. perf report: Show inline stack in browser mode
3. Rebase to latest perf/core branch. This patch is impacted.
a. perf report: Create a new option "--inline"
v1: Initial post
It would be useful for perf to support a mode to query the
inline stack for callgraph addresses. This would simplify
finding the right code in code that does a lot of inlining.
For example, the c code:
static inline void f3(void)
{
int i;
for (i = 0; i < 1000;) {
if(i%2)
i++;
else
i++;
}
printf("hello f3\n"); /* D */
}
/* < CALLCHAIN: f2 <- f1 > */
static inline void f2(void)
{
int i;
for (i = 0; i < 100; i++) {
f3(); /* C */
}
}
/* < CALLCHAIN: f1 <- main > */
static inline void f1(void)
{
int i;
for (i = 0; i < 100; i++) {
f2(); /* B */
}
}
/* < CALLCHAIN: main <- TOP > */
int main()
{
struct timeval tv;
time_t start, end;
gettimeofday(&tv, NULL);
start = end = tv.tv_sec;
while((end - start) < 5) {
f1(); /* A */
gettimeofday(&tv, NULL);
end = tv.tv_sec;
}
return 0;
}
The printed inline stack is:
0.05% test2 test2 [.] main
|
---/home/perf-dev/lck-2867/test/test2.c:27 (inline)
/home/perf-dev/lck-2867/test/test2.c:35 (inline)
/home/perf-dev/lck-2867/test/test2.c:45 (inline)
/home/perf-dev/lck-2867/test/test2.c:61 (inline)
I tag A/B/C/D in above c code to indicate the source line,
actually the inline stack is equal to:
0.05% test2 test2 [.] main
|
---D
C
B
A
Jin Yao (5):
perf report: Refactor common code in srcline.c
perf report: Find the inline stack for a given address
perf report: Create new inline option
perf report: Show inline stack in stdio mode
perf report: Show inline stack in browser mode
tools/perf/Documentation/perf-report.txt | 4 +
tools/perf/builtin-report.c | 2 +
tools/perf/ui/browsers/hists.c | 168 ++++++++++++++++++++--
tools/perf/ui/stdio/hist.c | 76 +++++++++-
tools/perf/util/hist.c | 5 +
tools/perf/util/sort.h | 1 +
tools/perf/util/srcline.c | 237 +++++++++++++++++++++++++++----
tools/perf/util/symbol-elf.c | 5 +
tools/perf/util/symbol.h | 5 +-
tools/perf/util/util.h | 16 +++
10 files changed, 481 insertions(+), 38 deletions(-)