Re: [RFC/PATCH 0/4] perf report: Support folded callchain output (v2)

From: Arnaldo Carvalho de Melo
Date: Mon Nov 02 2015 - 17:28:53 EST


Hi Namhyung,

On Tue, Nov 03, 2015 at 07:12:04AM +0900, Namhyung Kim wrote:
> On Mon, Nov 02, 2015 at 06:30:21PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, Nov 02, 2015 at 12:37:28PM -0800, Brendan Gregg wrote:
> > > On Mon, Nov 2, 2015 at 4:57 AM, Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > > > This is what Brendan requested on the perf-users mailing list [1] to
> > > > support FlameGraphs [2] more efficiently. This patchset adds a few
> > > > more callchain options to adjust the output for it.

> > > > At first, 'folded' output mode was added. The folded output puts all
> > > > callchain nodes in a line separated by semicolons, a space and the
> > > > value. Now it only supports --stdio as the other UIs provide some way
> > > > of folding/expanding callchains dynamically.

> > > > The value can now be one of 'percent', 'period', or 'count'. The
> > > > percent is the current default output and the period is the raw number
> > > > of sample periods. The count is the number of samples for each callchain.

> > > > Here's an example:

> > > > $ perf report --no-children --show-nr-samples --stdio -g folded,count
> > > > ...
> > > > 39.93% 80 swapper [kernel.vmlinux] [k] intel_idle
> > > > intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 57
> > > > intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;... 23

> > > So for the folded output I don't need the summary line (the row of
> > > columns printed by hist_entry__snprintf()), and don't need anything
> > > except folded stacks and the counts. If working with the existing
> > > stdio interface is making it harder than it needs to be, might it be

> > I don't think so, just add some flag asking for that
> > hist_entry__snprintf() to be suppressed, ideas for a long option name?

> > Having it as Namhyung did may have value for some people as a more
> > compact way to show the callchains together with the hist_entry line.

> Yeah, I'd keep the hist entry line unless it's too hard to
> parse/filter. IMHO it's just a way to show callchains, so no need to

What I suggested was to have something like:

$ perf report --no-children --no-hists --stdio -g folded,count
                            ^^^^^^^^^^
...
intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 57
intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;... 23

I.e. the first entry in the callchain is 'intel_idle', just like in what
Brendan called the 'summary line', i.e. redundant when what he wants is
just all the callchains and how many times they were sampled.
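
To make the format concrete, here is a rough Python sketch of how a
consumer could read such lines, assuming each line is the
semicolon-separated callchain followed by a space and the count, as in
the example above. The names parse_folded_line() and aggregate() are
made up for illustration; they are not part of perf or of the
FlameGraph scripts.

```python
from collections import Counter


def parse_folded_line(line):
    """Split 'a;b;c 57' into (['a', 'b', 'c'], 57).

    Assumption: the count is the last whitespace-separated token, so
    symbol names that contain spaces still end up in the stack part.
    """
    stack, _, value = line.strip().rpartition(" ")
    return stack.split(";"), int(value)


def aggregate(lines):
    """Sum the counts of identical callchains, skipping blank lines."""
    totals = Counter()
    for line in lines:
        if not line.strip():
            continue
        frames, count = parse_folded_line(line)
        totals[";".join(frames)] += count
    return totals
```

Splitting from the right keeps the parser independent of whatever
characters show up in symbol names, as long as the count stays last.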

> have separate output mode..

> Brendan, I guess you still need to know other info like cpu or pid, no?

Possibly, but just with the callchains he has enough info for the basic
flame graph, no?

> And I feel like it'd be better to put the count before the callchains
> for consistency like below. Is that OK with you?

Consistency with what?

The main thing here is the callchain; everything else is information
related to it, so showing it first makes sense to me.

Having some way to list the desired info to have for each callchain may
be interesting, and if he could do it like:

-g folded,count,cpu,other,fields

then he would know how to parse the per-callchain info at the end of
each line, right?
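
As a sketch of what that parsing could look like on the consumer side,
assuming the extra values are appended after the callchain in the same
order as they were listed after 'folded' (this field-list syntax is
only a proposal here, nothing implements it, and parse_with_fields()
is a hypothetical helper):

```python
def parse_with_fields(line, fields):
    """Parse 'a;b;c 57 3' with fields=['count', 'cpu'].

    Assumption: one whitespace-separated value per requested field,
    appended after the callchain in the order the fields were given.
    Values are kept as strings since the field types are not defined.
    """
    parts = line.strip().rsplit(None, len(fields))
    if len(parts) != len(fields) + 1:
        raise ValueError("expected %d trailing fields: %r" % (len(fields), line))
    stack, values = parts[0], parts[1:]
    return stack.split(";"), dict(zip(fields, values))
```

With the callchain first, the tool only has to peel a known number of
tokens off the end of each line, whatever the symbols look like.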

- Arnaldo

>
> $ perf report --no-children --show-nr-samples --stdio -g folded,count
> ...
> 39.93% 80 swapper [kernel.vmlinux] [k] intel_idle
> 57 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
> 23 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;...
>
>
> >
> > With this in mind, do you have any other issues with Namhyung's
> > patchkit? An acked-by/tested-by you would be nice to have, and then we
> > could work out the new option to suppress that hist_entry__snprintf()
> > in a follow up patch.
> >
> > > easier to make it a separate interface (ui/folded), that just emitted
> > > the folded output? Just an idea. This existing patchset is working for
> > > me, I'd just be filtering the output.
> > >
> > > Having the option for percentages and periods is nice. I can envisage
> > > using periods (for latency flame graphs).
>
> Glad to see you like it. :)
>
> Thanks,
> Namhyung