Re: [GIT PULL 00/19] perf/core improvements
From: Ingo Molnar
Date: Wed Apr 13 2016 - 03:03:39 EST
* Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> Hi Ingo,
>
> Please consider pulling, tested with 'perf test', 'make -C tools/perf
> build-test' and building on these userspaces, using docker:
>
> # dm
> alldeps-fedora-rawhide-minus-python-dev: Ok
> alldeps-fedora-20: Ok
> alldeps-ubuntu-12.04: Ok
> minimal-debian-experimental-x-mips64: Ok
> minimal-debian-experimental-x-mips64el: Ok
> minimal-debian-experimental-x-mipsel: Ok
> minimal-ubuntu-x-arm: Ok
> minimal-ubuntu-x-arm64: Ok
> minimal-ubuntu-x-ppc64: Ok
> minimal-ubuntu-x-ppc64el: Ok
> alldeps-debian: Ok
> alldeps-mageia: Ok
> alldeps-rhel7: Ok
> alldeps-centos: Ok
> alldeps-opensuse: Ok
> alldeps-ubuntu: Ok
> #
>
> This is on top of my previous pull request, which is not yet
> merged: perf-core-for-mingo-20160408.
>
> Best regards,
>
> - Arnaldo
>
> The following changes since commit 99e87f7bb7268cf644add87130590966fd5d0d17:
>
> perf symbols: Adjust symbol for shared objects (2016-04-08 09:58:15 -0300)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160411
>
> for you to fetch changes up to 00768a2bd3245eace0690fcf2c02776a256b66d7:
>
> perf trace: Print unresolved symbol names as addresses (2016-04-11 22:18:25 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements:
>
> - Automagically create a 'bpf-output' event, easing the setup of BPF
> C "scripts" that produce output via the perf ring buffer. Now it is
> just a matter of calling any perf tool, such as 'trace', with a C
> source file that references the __bpf_stdout__ output channel and
> that channel will be created and connected to the script:
>
> # trace -e nanosleep --event test_bpf_stdout.c usleep 1
> 0.013 ( 0.013 ms): usleep/2818 nanosleep(rqtp: 0x7ffcead45f40 ) ...
> 0.013 ( ): __bpf_stdout__:Raise a BPF event!..)
> 0.015 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
> 0.261 ( ): __bpf_stdout__:Raise a BPF event!..)
> 0.262 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
> 0.264 ( 0.264 ms): usleep/2818 ... [continued]: nanosleep()) = 0
> #
>
> Further work is needed to reduce the number of lines in a perf bpf C source
> file; this change is the part that greatly reduces the command line setup (Wang Nan)
>
> - 'perf trace' now supports callchains, with 'trace --call-graph dwarf' using
> libunwind, just like 'perf top', to ask the kernel for stack dumps for CFI
> processing. This reduces the overhead by asking just for userspace callchains
> and also only for the syscall exit tracepoint (raw_syscalls:sys_exit)
> (Milian Wolff, Arnaldo Carvalho de Melo)
>
> Try it with, for instance:
>
> # perf trace --call dwarf ping 127.0.0.1
>
> An excerpt of a system wide 'perf trace --call dwarf' session is at:
>
> https://fedorapeople.org/~acme/perf/perf-trace--call-graph-dwarf--all-cpus.txt
>
> You may need to bump the number of mmap pages, using -m/--mmap-pages,
> but on a Broadwell machine the defaults allowed system wide tracing to
> work without losing that many records. Experiment with just some
> syscalls, like:
>
> # perf trace --call dwarf -e nanosleep,futex
>
> All the targets available for 'perf record', 'perf top' (--pid, --tid, --cpu,
> etc) should work. Also --duration may be interesting to try.
>
> To get filenames from the pointer args of various syscalls (open, etc), add this
> to the mix:
>
> # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
>
> Making this work is next in line:
>
> # trace --call dwarf --ev sched:sched_switch/call-graph=fp/ usleep 1
>
> I.e. honouring per-tracepoint callchains in 'perf trace' in addition to
> those in raw_syscalls:sys_exit.
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
>
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (15):
> perf script: Use readdir() instead of deprecated readdir_r()
> perf thread_map: Use readdir() instead of deprecated readdir_r()
> perf tools: Use readdir() instead of deprecated readdir_r()
> perf tools: Use readdir() instead of deprecated readdir_r()
> perf dwarf: Guard !x86_64 definitions under #ifdef else clause
> perf evsel: Allow passing a left alignment when printing a symbol
> perf evsel: Rename print_ip() to fprintf_sym()
> perf evsel: Introduce fprintf_callchain() method out of fprintf_sym()
> perf trace: Exclude the kernel part of the callchain leading to a syscall
> perf evsel: Do not use globals in config()
> perf evlist: Add (reset,set)_sample_bit methods
> perf evsel: Rename config_callgraph() to config_callchain() and make it public
> perf trace: Make "--call-graph" affect just "raw_syscalls:sys_exit"
> perf evsel: Allow unresolved symbol names to be printed as addresses
> perf trace: Print unresolved symbol names as addresses
>
> Milian Wolff (2):
> perf evsel: Allow specifying a file to output in perf_evsel__print_ip
> perf trace: Add support for printing call chains on sys_exit events.
>
> Wang Nan (2):
> perf bpf: Clone bpf stdout events in multiple bpf scripts
> perf bpf: Automatically create bpf-output event __bpf_stdout__
>
> tools/perf/Documentation/perf-trace.txt | 9 ++
> tools/perf/arch/x86/tests/perf-time-to-tsc.c | 2 +-
> tools/perf/arch/x86/util/dwarf-regs.c | 8 +-
> tools/perf/builtin-kvm.c | 2 +-
> tools/perf/builtin-record.c | 10 +-
> tools/perf/builtin-script.c | 78 +++++++--------
> tools/perf/builtin-top.c | 2 +-
> tools/perf/builtin-trace.c | 65 +++++++++++-
> tools/perf/tests/bpf.c | 2 +-
> tools/perf/tests/code-reading.c | 2 +-
> tools/perf/tests/keep-tracking.c | 2 +-
> tools/perf/tests/openat-syscall-tp-fields.c | 2 +-
> tools/perf/tests/perf-record.c | 2 +-
> tools/perf/tests/switch-tracking.c | 2 +-
> tools/perf/util/bpf-loader.c | 143 +++++++++++++++++++++++++++
> tools/perf/util/bpf-loader.h | 19 ++++
> tools/perf/util/event.c | 12 +--
> tools/perf/util/evlist.c | 18 ++++
> tools/perf/util/evlist.h | 16 ++-
> tools/perf/util/evsel.c | 16 +--
> tools/perf/util/evsel.h | 14 ++-
> tools/perf/util/parse-events.c | 60 +++++------
> tools/perf/util/record.c | 5 +-
> tools/perf/util/session.c | 95 ++++++++++++------
> tools/perf/util/session.h | 8 +-
> tools/perf/util/symbol.c | 25 ++++-
> tools/perf/util/symbol.h | 6 ++
> tools/perf/util/thread_map.c | 8 +-
> 28 files changed, 487 insertions(+), 146 deletions(-)
Pulled, thanks a lot Arnaldo!
Ingo