Re: [PATCH v2] perf tools: Document --children option in more detail

From: Namhyung Kim
Date: Tue Apr 21 2015 - 03:06:25 EST


Hi Ingo,

On Mon, Apr 20, 2015 at 08:04:11PM +0200, Ingo Molnar wrote:
>
> * Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> > As the --children option changes the output of perf report (and perf
> > top) it sometimes confuses users. Add more words and examples to help
> > understanding of the option's behavior - and how to disable it ;-).
>
> Nice! :-)

Thanks! :)

>
> > +++ b/tools/perf/Documentation/overhead.txt
>
> As some people might stumble upon this the old fashioned way, by
> looking around in 'Documentation/', I'm wondering how this file name
> will tell the reader that this is about call chains? (which isn't the
> default recording mode) Maybe name it more explicitly, like
> callchain-overhead.txt?

OK. But I'd rather name it 'overhead-calculation.txt' to align with
the section name. And 'callchain-overhead' makes me thinking about
the overhead of processing callchains at record or report time.


>
> > @@ -0,0 +1,96 @@
> > +Overhead calculation
> > +--------------------
> > +The overhead can be shown in two columns as 'Children' and 'Self' when
> > +perf collects callchains. The self overhead is simply calculated by
> > +adding all period values of the entry - usually a function (symbol).
> > +This is the value that perf shows traditionally and sum of the all
> > +self overhead should be 100%.
>
> s/sum of the all self overhead should be 100%
> /sum of all the self-overhead values should be 100%
>
> ?

OK

>
> > +
> > +The children overhead is calculated by adding all period values of the
> > +child functions so that it can show the total overhead of the higher
> > +level functions. The children here means that functions called from
> > +another function.
>
> Maybe put it into quotes to specify it more clearly? Also, maybe add a
> bit more info - Something like:
>
> The 'children' overhead is calculated by adding all period values of
> the child functions so that it can show the total overhead of the
> higher level functions even if they don't directly execute much.
> 'Children' here means functions that are called from another
> (parent) function.
>
> ?
>
> > +
> > +It might be confusing that the sum of the all children overhead
>
> s/children
> /'children'

Right!

>
> > +exceeds 100% since each of them is already an accumulation of (self)
> > +overhead of its children. But with this enabled, users can find which
> > +function has the most overhead even if samples are spread over the
> > +children.
> > +
> > +Consider the following example; thereâre three functions like below.
> > +
> > +-----------------------
> > +void foo(void) {
> > + /* something */
> > +}
> > +
> > +bar(void) {
>
> s/bar(void) {
> /void bar(void) {
>
> :-)

Oops..

>
> > + /* do something */
> > + foo();
> > +}
> > +
> > +int main(void) {
> > + bar()
> > + return 0;
> > +}
> > +-----------------------
> > +
> > +In this case 'foo' is a child of 'bar', and 'bar' is an immediate
> > +child of 'main' so 'foo' also is a child of 'main'. In other words,
> > +'main' is a parent of 'foo' and 'bar'. and 'bar' is a parent of 'foo'.
>
> s/and 'bar'. and
> /and 'bar', and

OK

>
> > +
> > +Suppose all samples are recorded in the 'foo' and 'bar' only. When
>
> either:
> Suppose all samples are recorded in the 'foo' and 'bar' functions only.
>
> or:
> Suppose all samples are recorded in 'foo' and 'bar' only.

I like the shorter. :)

>
> > +you record with callchain you'll see something like below in the usual
>
> s/callchain/callchains
>
> > +(self-overhead-only) output of the perf report:
>
> s/of the perf report/of perf report

OK

>
> > +
> > +----------------------------------
> > +Overhead Symbol
> > +........ .....................
> > + 60.00% foo
> > + |
> > + --- foo
> > + bar
> > + main
> > + __libc_start_main
> > +
> > + 40.00% bar
> > + |
> > + --- bar
> > + main
> > + __libc_start_main
> > +----------------------------------
> > +
> > +When --children option is enabled, the (self) overhead of children (in
>
> s/When --children option
> /When the --children option

OK

>
> > +this case foo and bar) are added to the parent to calculate the
> > +children overhead. In this case the report could be displayed as:
> > +
> > +-------------------------------------------
> > +Children Self Symbol
> > +........ ........ ....................
> > + 100.00% 0.00% __libc_start_main
> > + |
> > + --- __libc_start_main
> > +
> > + 100.00% 0.00% main
> > + |
> > + --- main
> > + __libc_start_main
> > +
> > + 100.00% 40.00% bar
> > + |
> > + --- bar
> > + main
> > + __libc_start_main
> > +
> > + 60.00% 60.00% foo
> > + |
> > + --- foo
> > + bar
> > + main
> > + __libc_start_main
> > +-------------------------------------------
> > +
> > +Since v3.16 the children overhead is shown by default and the output
> > +is sorted by the values. Children overhead is disabled by specifying
> > +--no-children option on the command line or by adding 'report.children
> > += false' or 'top.children = false' in the perfconfig file.
>
> s/perfconfig/.perfconfig
>
> ?

It seems we can have a system-wide /etc/perfconfig too. How about
'the perf config file' instead?

>
>
> > diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> > index 4879cf638824..bd97e5c3c9b6 100644
> > --- a/tools/perf/Documentation/perf-report.txt
> > +++ b/tools/perf/Documentation/perf-report.txt
> > @@ -193,6 +193,7 @@ OPTIONS
> > Accumulate callchain of children to parent entry so that then can
> > show up in the output. The output will have a new "Children" column
> > and will be sorted on the data. It requires callchains are recorded.
> > + See `overhead calculation' section for more details.
>
> s/See `overhead calculation' section
> /See the `overhead calculation' section

OK

>
> >
> > --max-stack::
> > Set the stack depth limit when parsing the callchain, anything
> > @@ -323,6 +324,9 @@ OPTIONS
> > --header-only::
> > Show only perf.data header (forces --stdio).
> >
> > +
> > +include::overhead.txt[]
> > +
> > SEE ALSO
> > --------
> > linkperf:perf-stat[1], linkperf:perf-annotate[1]
> > diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
> > index 3265b1070518..526d6bebec1f 100644
> > --- a/tools/perf/Documentation/perf-top.txt
> > +++ b/tools/perf/Documentation/perf-top.txt
> > @@ -168,7 +168,7 @@ Default is to monitor all CPUS.
> > Accumulate callchain of children to parent entry so that then can
> > show up in the output. The output will have a new "Children" column
> > and will be sorted on the data. It requires -g/--call-graph option
> > - enabled.
> > + enabled. See `overhead calculation' section for more details.
>
> s/See `overhead calculation' section
> /See the `overhead calculation' section

OK

>
> With the above things fixed:

Will fix all my bad English. :)

>
> Reviewed-by: Ingo Molnar <mingo@xxxxxxxxxx>

Thanks for taking your time to review this!
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/