Re: [PATCH 15/25] perf report: Make max_stack value allow for synthesized callchains

From: Adrian Hunter
Date: Tue Sep 29 2015 - 04:55:39 EST


On 28/09/15 23:03, Arnaldo Carvalho de Melo wrote:
> Em Fri, Sep 25, 2015 at 04:15:46PM +0300, Adrian Hunter escreveu:
>> perf report has an option (--max-stack) to set the maximum stack depth
>> when processing callchains. The option defaults to the hard-coded
>> maximum definition PERF_MAX_STACK_DEPTH which is 127. The intention of
>> the option is to allow the user to reduce the processing time by
>> reducing the amount of the callchain that is processed.
>>
>> It is also possible, when processing instruction traces, to synthesize
>> callchains. Synthesized callchains do not have the kernel size
>> limitation and are whatever size the user requests, although validation
>> presently prevents the user requested a value greater that 1024. The
>> default value is 16.
>
> So, haven't checked the options, but one can possibly use both the way
> itrace has to ask for a max stack size and also via --max-stack, right?

Possibly, but it would not be a common paradigm.

>
> In that case we better emit a warning or plain state that one either
> uses one way of setting the max stack or the other?

max_stack was added as an optimization to reduce processing time, so
people specifying --max-stack might get a increased processing time
if combined with synthesized callchains, but otherwise no real harm.

A warning seems like overkill. Could amend the documenation e.g.


diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index b941d5e07e28..ce499035e6d8 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -205,6 +205,8 @@ OPTIONS
beyond the specified depth will be ignored. This is a trade-off
between information loss and faster processing especially for
workloads that can have a very long callchain stack.
+ Note that when using the --itrace option the synthesized callchain size
+ will override this value if the synthesized callchain size is bigger.

Default: 127




>
> I'm applying the patch, because it is unlikely that this gets specified,
> but would be good to close this gap.
>
> - Arnaldo
>
>> To allow for synthesized callchains, make the max_stack value at least
>> the same size as the synthesized callchain size.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
>> ---
>> tools/perf/builtin-report.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
>> index e94e5c7155af..37c9f5125887 100644
>> --- a/tools/perf/builtin-report.c
>> +++ b/tools/perf/builtin-report.c
>> @@ -809,6 +809,10 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
>> if (report.inverted_callchain)
>> callchain_param.order = ORDER_CALLER;
>>
>> + if (itrace_synth_opts.callchain &&
>> + (int)itrace_synth_opts.callchain_sz > report.max_stack)
>> + report.max_stack = itrace_synth_opts.callchain_sz;
>> +
>> if (!input_name || !strlen(input_name)) {
>> if (!fstat(STDIN_FILENO, &st) && S_ISFIFO(st.st_mode))
>> input_name = "-";
>> --
>> 1.9.1
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/