Re: [PATCH v6 3/5] tools, perf, script: Add --call-trace and --call-ret-trace

From: leo . yan
Date: Fri Sep 28 2018 - 06:23:43 EST


Hi Andi,

On Thu, Sep 20, 2018 at 11:05:38AM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
>
> Add short cut options to print PT call trace and call-ret-trace,
> for calls and call and returns. Roughly corresponds to ftrace
> function tracer and function graph tracer.
>
> Just makes these common use cases nicer to use.
>
> % perf record -a -e intel_pt// sleep 1
> % perf script --call-trace
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
>
> % perf script --call-ret-trace
> perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config
> perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config
> perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add
> perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable
> perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void
> perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107
> perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
> perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int
> perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in
> perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match
> perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match
> perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in
> perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
> perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_txn
> perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_sched_in.isra.107
> perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_event_set_state.part.71
>
> Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> ---
> v2: Print errors, power, ptwrite too
> ---
> tools/perf/Documentation/perf-script.txt | 7 +++++++
> tools/perf/builtin-script.c | 24 ++++++++++++++++++++++++
> 2 files changed, 31 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
> index 00c655ab4968..805baabd238e 100644
> --- a/tools/perf/Documentation/perf-script.txt
> +++ b/tools/perf/Documentation/perf-script.txt
> @@ -390,6 +390,13 @@ include::itrace.txt[]
> --xed::
> Run xed disassembler on output. Requires installing the xed disassembler.
>
> +--call-trace::
> + Show call stream for intel_pt traces. The CPUs are interleaved, but
> + can be filtered with -C.
> +
> +--call-ret-trace::
> + Show call and return stream for intel_pt traces.

It seems to me that these two features are _NOT_ specific to intel_pt;
other hardware tracing (e.g. Arm CoreSight) could enable them as well.
This patch documents them only for intel_pt, so if we later enable
them on Arm platforms we will need to change the doc. Alternatively,
we could use a more general description for these two options in the
first place. What do you think?

Apart from this question, the patch looks good to me.

Thanks,
Leo Yan

> SEE ALSO
> --------
> linkperf:perf-record[1], linkperf:perf-script-perl[1],
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 519ebb5a1f96..6c4562973983 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -3119,6 +3119,26 @@ static int parse_xed(const struct option *opt __maybe_unused,
> return 0;
> }
>
> +static int parse_call_trace(const struct option *opt __maybe_unused,
> + const char *str __maybe_unused,
> + int unset __maybe_unused)
> +{
> + parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent", 0);
> + itrace_parse_synth_opts(opt, "cewp", 0);
> + nanosecs = true;
> + return 0;
> +}
> +
> +static int parse_callret_trace(const struct option *opt __maybe_unused,
> + const char *str __maybe_unused,
> + int unset __maybe_unused)
> +{
> + parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent,+flags", 0);
> + itrace_parse_synth_opts(opt, "crewp", 0);
> + nanosecs = true;
> + return 0;
> +}
> +
> int cmd_script(int argc, const char **argv)
> {
> bool show_full_info = false;
> @@ -3210,6 +3230,10 @@ int cmd_script(int argc, const char **argv)
> "Decode instructions from itrace", parse_insn_trace),
> OPT_CALLBACK_OPTARG(0, "xed", NULL, NULL, NULL,
> "Run xed disassembler on output", parse_xed),
> + OPT_CALLBACK_OPTARG(0, "call-trace", &itrace_synth_opts, NULL, NULL,
> + "Decode calls from itrace", parse_call_trace),
> + OPT_CALLBACK_OPTARG(0, "call-ret-trace", &itrace_synth_opts, NULL, NULL,
> + "Decode calls and returns from itrace", parse_callret_trace),
> OPT_STRING(0, "stop-bt", &symbol_conf.bt_stop_list_str, "symbol[,symbol...]",
> "Stop display of callgraph at these symbols"),
> OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
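
For readers following the two callbacks quoted above, here is a minimal
standalone sketch of what each shortcut appears to expand to. The field
and itrace strings are copied from the patch. The helper name
expand_shortcut() and the table are hypothetical and not part of perf.
The long-form command it prints is an assumption: it treats "--itrace",
"-F" and "--ns" as the command-line counterparts of
itrace_parse_synth_opts(), parse_output_fields() and nanosecs = true,
and it has not been checked against a real perf build.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One shortcut option and the settings its callback applies in the patch. */
struct shortcut {
	const char *name;    /* long option name, without the leading "--" */
	const char *fields;  /* string handed to parse_output_fields() */
	const char *itrace;  /* string handed to itrace_parse_synth_opts() */
};

/* Values copied from parse_call_trace() and parse_callret_trace(). */
static const struct shortcut shortcuts[] = {
	{ "call-trace",     "-ip,-addr,-event,-period,+callindent",        "cewp"  },
	{ "call-ret-trace", "-ip,-addr,-event,-period,+callindent,+flags", "crewp" },
};

/*
 * Hypothetical helper, not part of perf: write the assumed long-form
 * command for a shortcut into buf. Returns 0 on success, -1 if the
 * name is unknown or buf is too small.
 */
static int expand_shortcut(const char *name, char *buf, size_t len)
{
	size_t i;

	assert(name != NULL && buf != NULL);

	for (i = 0; i < sizeof(shortcuts) / sizeof(shortcuts[0]); i++) {
		int n;

		if (strcmp(name, shortcuts[i].name) != 0)
			continue;
		/* "--ns" stands in for the patch setting nanosecs = true. */
		n = snprintf(buf, len, "perf script --itrace=%s -F %s --ns",
			     shortcuts[i].itrace, shortcuts[i].fields);
		return (n < 0 || (size_t)n >= len) ? -1 : 0;
	}
	return -1;
}
```

Calling expand_shortcut("call-ret-trace", buf, sizeof(buf)) fills buf
with the assumed long form. The only differences between the two
shortcuts are the extra "r" in the itrace string and the "+flags"
output field.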