Re: [RFC] perf record: missing buildid for callstack modules

From: Adrian Hunter
Date: Mon Jan 11 2016 - 06:58:26 EST


On 11/01/16 13:02, Namhyung Kim wrote:
> Hi Adrian,
>
> On Mon, Jan 11, 2016 at 11:27:56AM +0200, Adrian Hunter wrote:
>> On 09/01/16 12:31, Namhyung Kim wrote:
>>> Hi Stephane,
>>>
>>> On Fri, Jan 08, 2016 at 10:01:24AM -0800, Stephane Eranian wrote:
>>>> On Thu, Jan 7, 2016 at 3:47 PM, Arnaldo Carvalho de Melo
>>>> <acme@xxxxxxxxxx> wrote:
>>>>> Em Fri, Jan 08, 2016 at 07:47:03AM +0900, Namhyung Kim escreveu:
>>>>>> On January 8, 2016 7:00:35 AM GMT+09:00, Stephane Eranian <eranian@xxxxxxxxxx> wrote:
>>>>>>> On Thu, Jan 7, 2016 at 1:59 PM, Arnaldo Carvalho de Melo
>>>>>>> <acme@xxxxxxxxxx> wrote:
>>>>>>>> Em Thu, Jan 07, 2016 at 01:56:14PM -0800, Stephane Eranian escreveu:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Whenever you do:
>>>>>>>>>
>>>>>>>>> $ perf record -g -a sleep 10
>>>>>>>>>
>>>>>>>>> Perf will collect the callstack for each sample. At the end of the
>>>>>>>>> run, perf record
>>>>>>>>> adds the buildid for all dso with at least one sample. But when it
>>>>>>> does this, it
>>>>>>>>> only looks at the sampled IP and ignore the modules traversed by the
>>>>>>> callstack.
>>>>>>>>> That means that, it is not possible to uniquely identify the modules
>>>>>>> executed,
>>>>>>>>> unless they had at least one IP sample captured. But this is not
>>>>>>>>> always the case.
>>>>>>>>>
>>>>>>>>> How about providing an option to perf record to force collecting
>>>>>>>>> buildid for all IPs
>>>>>>>>> captured in the callstack? I understand that would cost more at the
>>>>>>> end of the
>>>>>>>>> collection, but this would be beneficial to several monitoring
>>>>>>> scenarios.
>>>>>>>>
>>>>>>>> I agree, would consider applying a patch that provides the option but
>>>>>>>> does not do this by default.
>>>>>>>>
>>>>>>> I agree, not the default.
>>>>>>
>>>>>> Hi Stephane,
>>>>>>
>>>>>> Please see
>>>>>>
>>>>>> https://lkml.org/lkml/2015/3/22/249
>>>>>
>>>>>
>>>>> Oops, Stephane, please try this, so that we can finally merge it :-\
>>>>>
>>>> I will try it today. However, I am a bit worried about the performance
>>>> impact. Unless I am missing something in this approach we may end up
>>>> looking up N times the same module if it appears in N callstacks. In
>>>> Andi's suggested approach, there would be only one pass at the beginning
>>>> (or the end of the run). But you could miss some modules if they are gone
>>>> by the time you run the pass.
>>>
>>> How about this then?
>>>
>>> Adrian, is it ok to skip process_buildids() for the auxtrace?
>>
>> If you don't post-process (i.e. call process_buildids), then where do the
>> DSOs come from? i.e. dsos__hit_all() just hits the DSOs that exist.
>
> Ah, right. I somehow thought that it was processed already elsewhere.
>
> Then, how about this?

Looks OK for auxtrace.

In other respects you should probably clarify what happens when --no-buildid
and --buildid-all are used together. Maybe it should be an error.

>
>
>
>>From 38b2bc273329afe4334a11d87d20ef71639132fb Mon Sep 17 00:00:00 2001
> From: Namhyung Kim <namhyung@xxxxxxxxxx>
> Date: Sat, 9 Jan 2016 22:37:55 +0900
> Subject: [PATCH] perf record: Add --buildid-all option
>
> The --buildid-all option is to record build-id of all DSOs in the file.
> It might be very costly to postprocess samples to find which DSO hits.
>
> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> ---
> tools/perf/Documentation/perf-record.txt | 3 +++
> tools/perf/builtin-record.c | 26 ++++++++++++++++++++------
> 2 files changed, 23 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 3a1a32f5479f..fbceb631387c 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -338,6 +338,9 @@ Options passed to clang when compiling BPF scriptlets.
> Specify vmlinux path which has debuginfo.
> (enabled when BPF prologue is on)
>
> +--buildid-all::
> +Record build-id of all DSOs regardless whether it's actually hit or not.
> +
> SEE ALSO
> --------
> linkperf:perf-stat[1], linkperf:perf-list[1]
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index dc4e0adf5c5b..319712a4e02b 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -50,6 +50,7 @@ struct record {
> int realtime_prio;
> bool no_buildid;
> bool no_buildid_cache;
> + bool buildid_all;
> unsigned long long samples;
> };
>
> @@ -362,6 +363,13 @@ static int process_buildids(struct record *rec)
> */
> symbol_conf.ignore_vmlinux_buildid = true;
>
> + /*
> + * If --buildid-all is given, it marks all DSO regardless of hits,
> + * so no need to process samples.
> + */
> + if (rec->buildid_all)
> + rec->tool.sample = NULL;
> +
> return perf_session__process_events(session);
> }
>
> @@ -756,12 +764,8 @@ out_child:
>
> if (!rec->no_buildid) {
> process_buildids(rec);
> - /*
> - * We take all buildids when the file contains
> - * AUX area tracing data because we do not decode the
> - * trace because it would take too long.
> - */
> - if (rec->opts.full_auxtrace)
> +
> + if (rec->buildid_all)
> dsos__hit_all(rec->session);
> }
> perf_session__write_header(rec->session, rec->evlist, fd, true);
> @@ -1138,6 +1142,8 @@ struct option __record_options[] = {
> "options passed to clang when compiling BPF scriptlets"),
> OPT_STRING(0, "vmlinux", &symbol_conf.vmlinux_name,
> "file", "vmlinux pathname"),
> + OPT_BOOLEAN(0, "buildid-all", &record.buildid_all,
> + "Record build-id of all DSOs regardless of hits"),
> OPT_END()
> };
>
> @@ -1255,6 +1261,14 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
> if (err)
> goto out_symbol_exit;
>
> + /*
> + * We take all buildids when the file contains
> + * AUX area tracing data because we do not decode the
> + * trace because it would take too long.
> + */
> + if (rec->opts.full_auxtrace)
> + rec->buildid_all = true;
> +
> if (record_opts__config(&rec->opts)) {
> err = -EINVAL;
> goto out_symbol_exit;
>