Re: [PATCH RFC V9 2/3] perf,tools: per-event callgraph support

From: Arnaldo Carvalho de Melo
Date: Fri Aug 07 2015 - 11:49:48 EST


Em Fri, Aug 07, 2015 at 12:38:43PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Aug 06, 2015 at 03:44:53PM -0400, kan.liang@xxxxxxxxx escreveu:
> > From: Kan Liang <kan.liang@xxxxxxxxx>
> > Here are some examples and test results.
> >
> > 1. Comparing the elapsed time and perf.data size from "kernbench -M -H".
> >
> > The test command for FULL callgraph and time support.
> > "perf record -e
> > '{cpu/cpu-cycles,period=100000/,cpu/instructions,period=20000/p}'
> > --call-graph fp --time"

Some other oddity I noticed while testing:

For:

[root@zoo ~]# perf report --header-only
# ========
# captured on: Fri Aug 7 12:39:13 2015
# hostname : zoo
# os release : 4.2.0-rc5+
# perf version : 4.2.rc4.g1cd951
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz
# cpuid : GenuineIntel,6,58,9
# total memory : 8080692 kB
# cmdline : /home/acme/bin/perf record -e {cpu/cpu-cycles,call-graph=fp,time,period=10000/pp,cpu/instructions,call-graph=no,time=0,period=20000/p} -a
# event : name = cpu/cpu-cycles,call-graph=fp,time,period=10000/pp, , id = { 38537, 38538, 38539, 38540 }, type = 4, size = 112, config = 0x3c, { sample_perio
# event : name = cpu/instructions,call-graph=no,time=0,period=20000/p, , id = { 38541, 38542, 38543, 38544 }, type = 4, size = 112, config = 0xc0, { sample_pe
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, power = 7, uncore_imc = 8, tracepoint = 2, uncore_cbox_0 = 9, uncore_cbox_1 = 10, breakpoint = 5
# group: {cpu/cpu-cycles,call-graph=fp,time,period=10000/pp,cpu/instructions,call-graph=no,time=0,period=20000/p}
# ========
#

I get:

Samples: 2K of event 'cpu/instructions,call-graph=no,time=0,period=20000/p', Event count (approx.): 46956518
Children Self Command Shared Object Symbol â
- 67.56% 0.00% qemu-system-x86 [unknown] [.] 0xad5e258d4c544155 â
0xad5e258d4c544155 â
- 67.56% 0.00% qemu-system-x86 libc-2.20.so [.] __libc_start_main â
__libc_start_main â
0xad5e258d4c544155 â
- 67.56% 0.00% qemu-system-x86 perf [.] main â
main â
__libc_start_main â
0xad5e258d4c544155 â
- 67.56% 0.00% qemu-system-x86 perf [.] run_builtin â
run_builtin â
main â
__libc_start_main â

This is in the 'perf report' TUI, why, for an event with 'callgraph=no', we
get callchains? How come?

If I try it with the --stdio, well, its there as well:

# Samples: 2K of event 'cpu/instructions,call-graph=no,time=0,period=20000/p'
# Event count (approx.): 46956518
#
# Children Self Command Shared Object Symbol
# ........ ........ ............... .......................... ..............................................
#
67.56% 0.00% qemu-system-x86 [unknown] [.] 0xad5e258d4c544155
|
---0xad5e258d4c544155

67.56% 0.00% qemu-system-x86 libc-2.20.so [.] __libc_start_main
|
---__libc_start_main
0xad5e258d4c544155

67.56% 0.00% qemu-system-x86 perf [.] main
|
---main
__libc_start_main
0xad5e258d4c544155

67.56% 0.00% qemu-system-x86 perf [.] run_builtin
|
---run_builtin
main
__libc_start_main
0xad5e258d4c544155

Can you take a look? What I have, i.e. the patches from Jiri and Kan
moving stuff out of the way of the python binding and this patch is at
the tmp.perf/core branch in my git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git

- Arnaldo


> Jiri, while testing this I noticed that the message for EINVAL when
> using the cpu// syntax (per-event settings) is cryptic:
>
> [root@zoo ~]# perf record -e 'cpu/cpu-cycles,call-graph=fp,time,period=100000/p' ls
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu/cpu-cycles,call-graph=fp,time,period=100000/p).
> /bin/dmesg may provide additional information.
> No CONFIG_PERF_EVENTS=y kernel support configured?
>
> Whereas if we use -F, it is much, much clearer, telling the user exactly
> what is failing and what needs to be done to make it work:
>
> [root@zoo ~]# perf record -F 100000 -e cpu/cpu-cycles/ usleep 1
> Maximum frequency rate (25000) reached.
> Please use -F freq option with lower value or consider
> tweaking /proc/sys/kernel/perf_event_max_sample_rate.
> [root@zoo ~]#
>
> Hope this is something easy to wire up, given your event parsing kung foo
> skillz...
>
> ;-)
>
> - Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/