Re: [PATCH v3] perf tools: Document --children option in more detail

From: Arnaldo Carvalho de Melo
Date: Tue Apr 21 2015 - 11:43:08 EST


Em Tue, Apr 21, 2015 at 09:28:31PM +0900, Namhyung Kim escreveu:
> As the --children option changes the output of perf report (and perf
> top) it sometimes confuses users. Add more words and examples to help
> understanding of the option's behavior - and how to disable it ;-).
>
> Reviewed-by: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Taeung Song <treeze.taeung@xxxxxxxxx>
> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> ---
> tools/perf/Documentation/overhead-calculation.txt | 97 +++++++++++++++++++++++
> tools/perf/Documentation/perf-report.txt | 4 +
> tools/perf/Documentation/perf-top.txt | 3 +-
> 3 files changed, 103 insertions(+), 1 deletion(-)
> create mode 100644 tools/perf/Documentation/overhead-calculation.txt
>
> diff --git a/tools/perf/Documentation/overhead-calculation.txt b/tools/perf/Documentation/overhead-calculation.txt
> new file mode 100644
> index 000000000000..d7095f77154c
> --- /dev/null
> +++ b/tools/perf/Documentation/overhead-calculation.txt

I think Ingo suggested that you renamed this file to include the word
"callchain" in it, no? looking at "overhead-calculation" I feel like I
first have to open it to figure out what kind of overhead is this,
perhaps it would be better named:

tools/perf/Documentation/callchain-overhead.txt

?

- Arnaldo

> @@ -0,0 +1,97 @@
> +Overhead calculation
> +--------------------
> +The overhead can be shown in two columns as 'Children' and 'Self' when
> +perf collects callchains. The self overhead is simply calculated by
> +adding all period values of the entry - usually a function (symbol).
> +This is the value that perf shows traditionally and sum of all the
> +self overhead values should be 100%.

For consistency sake, please use always 'self' instead of it without
quotes, like you do in the next paragraph for 'children'.

> +The 'children' overhead is calculated by adding all period values of
> +the child functions so that it can show the total overhead of the
> +higher level functions even if they don't directly execute much.
> +'Children' here means functions that are called from another (parent)
> +function.
> +
> +It might be confusing that the sum of all the 'children' overhead
> +values exceeds 100% since each of them is already an accumulation of

Just like you do in the next line:

> +'self' overhead of its child functions. But with this enabled, users

> +can find which function has the most overhead even if samples are
> +spread over the children.
> +
> +Consider the following example; thereâre three functions like below.

Humm, "there are", no? Else I'm learning something new here...

Googled, some controversy,
http://grammar.about.com/od/words/a/EnglishContractions.htm doesn't list
it as a "common contraction", to name just one of the sites, anyway,
found it unfamiliar:

In the kernel sources:

[acme@zoo linux]$ find . -type f | xargs grep -i "there're" | wc -l
15
[acme@zoo linux]$ find . -type f | xargs grep -i "there are" | wc -l
5343
[acme@zoo linux]$

> +
> +-----------------------
> +void foo(void) {
> + /* something */
> +}
> +
> +void bar(void) {
> + /* do something */
> + foo();
> +}
> +
> +int main(void) {
> + bar()
> + return 0;
> +}
> +-----------------------
> +
> +In this case 'foo' is a child of 'bar', and 'bar' is an immediate
> +child of 'main' so 'foo' also is a child of 'main'. In other words,
> +'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'.
> +
> +Suppose all samples are recorded in 'foo' and 'bar' only. When it's
> +recorded with callchains the output will be shown something like below

will show something like
> +in the usual (self-overhead-only) output of perf report:
> +
> +----------------------------------
> +Overhead Symbol
> +........ .....................
> + 60.00% foo
> + |
> + --- foo
> + bar
> + main
> + __libc_start_main
> +
> + 40.00% bar
> + |
> + --- bar
> + main
> + __libc_start_main
> +----------------------------------
> +
> +When the --children option is enabled, the (self) overhead of children (in
> +this case foo and bar) are added to the parent to calculate the
is
> +children overhead. In this case the report could be displayed as:
> +
> +-------------------------------------------
> +Children Self Symbol
> +........ ........ ....................
> + 100.00% 0.00% __libc_start_main
> + |
> + --- __libc_start_main
> +
> + 100.00% 0.00% main
> + |
> + --- main
> + __libc_start_main
> +
> + 100.00% 40.00% bar
> + |
> + --- bar
> + main
> + __libc_start_main
> +
> + 60.00% 60.00% foo
> + |
> + --- foo
> + bar
> + main
> + __libc_start_main
> +-------------------------------------------
> +
> +Since v3.16 the children overhead is shown by default and the output
> +is sorted by the values. Children overhead is disabled by specifying
'the values'? Which ones, maybe rephrase as 'its
values', i.e. the "parent + children total overhead" value?
> +--no-children option on the command line or by adding 'report.children
> += false' or 'top.children = false' in the perf config file.

One can as well use the OPTION_FOO shortening mechanism and instead use:

perf report --no-ch

Which is enough to disambiguate it from "--no-column-widths" and "--no-cpu".

> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 4879cf638824..b7bb298deee3 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -193,6 +193,7 @@ OPTIONS
> Accumulate callchain of children to parent entry so that then can
> show up in the output. The output will have a new "Children" column
> and will be sorted on the data. It requires callchains are recorded.
> + See the `overhead calculation' section for more details.

`callchain overhead'

>
> --max-stack::
> Set the stack depth limit when parsing the callchain, anything
> @@ -323,6 +324,9 @@ OPTIONS
> --header-only::
> Show only perf.data header (forces --stdio).
>
> +
> +include::overhead-calculation.txt[]

+include::callchain-overhead.txt

> +
> SEE ALSO
> --------
> linkperf:perf-stat[1], linkperf:perf-annotate[1]
> diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
> index 3265b1070518..6e8fea6fd179 100644
> --- a/tools/perf/Documentation/perf-top.txt
> +++ b/tools/perf/Documentation/perf-top.txt
> @@ -168,7 +168,7 @@ Default is to monitor all CPUS.
> Accumulate callchain of children to parent entry so that then can
> show up in the output. The output will have a new "Children" column
> and will be sorted on the data. It requires -g/--call-graph option
> - enabled.
> + enabled. See the `overhead calculation' section for more details.

ditto.

>
> --max-stack::
> Set the stack depth limit when parsing the callchain, anything
> @@ -234,6 +234,7 @@ INTERACTIVE PROMPTING KEYS
>
> Pressing any unmapped key displays a menu, and prompts for input.
>
> +include::overhead-calculation.txt[]

Ditto.

>
> SEE ALSO
> --------
> --
> 2.3.5
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/