Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

From: Arnaldo Carvalho de Melo
Date: Wed Nov 04 2015 - 13:09:15 EST


On Thu, Nov 05, 2015 at 12:34:57AM +0900, Namhyung Kim wrote:
> Hi Arnaldo and Brendan,
>
> On Wed, Nov 04, 2015 at 11:51:31AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Nov 03, 2015 at 10:02:32PM -0800, Brendan Gregg wrote:
> > > On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > > > Ah, makes sense. So it'd look like
> >
> > > > $ perf report --stdio -g folded,count,info -F none -s comm
> > > > $ perf report --stdio -g folded,count,info -F none -s pid
> >
> > > > The output would be
> >
> > > > 809 swapper-0 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
> >
> > > Thanks, looks almost right: a couple of minor changes:
> >
> > > 1. If perf already has the precedent of "PID:comm", instead of my
> > > "comm-PID", then maybe it should use "PID:comm" for perf consistency.
> > > Doesn't make much difference to me.

> Right. Actually I'd like to write it that way.. ;-)

Well, those are two pieces of information: "comm" and "pid", so it would
be nice to take this opportunity to drop the combined form, i.e. just
treat each one as any other field, separate them via the designated
separator, and only show the ones specified.

> > > 2. The second space, delimiting "PID:comm" (or comm) and the stack...
> > > I'm nervous about using space as a delimiter any more than once, since
> > > it can also appear in comm (eg, "java main") and frames (eg,
> > > "JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
> > > Thread*)" -- that's direct from "perf script"!). I'd consider making
> > > it a semicolon:

The C++ symbol names are the biggest challenge here for a single-line
CSV (comma-separated, quoted) record :-\
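
To illustrate the problem, here is a minimal Python sketch (not perf code).
It takes the symbol Brendan quoted from "perf script", writes it into a CSV
record with the standard csv module, and compares a quoting-aware reader
with a naive split on commas. The helper name csv_record and the sample
field values are only assumptions for the example:

```python
import csv
import io

def csv_record(fields):
    """Format one CSV record, quoting only the fields that need it."""
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    return buf.getvalue().rstrip("\r\n")

# Symbol taken from the thread; count and comm are illustrative values.
symbol = ("JavaCalls::call_helper(JavaValue*, methodHandle*, "
          "JavaCallArguments*, Thread*)")
line = csv_record([809, "java main", symbol])

# A quoting-aware reader recovers the three original fields.
parsed = next(csv.reader([line]))

# A naive split on "," breaks the symbol into several pieces.
naive = line.split(",")
```

So a script consuming such a record needs a real CSV parser, a plain split
on the delimiter is not enough once C++ symbols show up.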

> Fair enough.

> > > 809 swapper-0;cpu_bringup_and_idle;cpu_startup_entry;...
> >
> > > So the output is "value key", and key is a semicolon delimited stack
> > > with an optional comm or PID:comm frame at the start.
> >
> > Agreed, but then, we can have some sort of default and also be able to,
> > using -F, specify what are the fields we want, and in which order, and I
> > liked your suggestion of being able to specify "-F none" and that mean
> > no hist line to be produced.
> >
> > Likewise, the way that each callchain line should be formatted should be
> > programmable via the command line, via the -g option, no? Then script
> > writers could use it in a way that doesn't require further processing,
> > as Brendan showed.
>
> Right. So '-s <key1>[,<key2>,...] -g info' can control which info is
> displayed along with the callchains.

So you force the same selection of fields to be used for both the
hist_entry and the callchains?

And why is it that some of the fields will be selected via -s (comm, dso)
and other fields will be selected via -g (count, this "info" thing)?

Why not be flexible and allow any set of fields to be used in both
cases, without one being tied to the other?

I.e. instead of:

-s <key1>[,<key2>,...] -g info

We use:

-s <key1>[,<key2>,...] -g [<keyA>[,<keyB>],...]

If one wants the same set for both, then yeah, a keyword for that
would be useful, reusing your "info":

-s <key1>[,<key2>,...] -g info

Would mean:

-s <key1>[,<key2>,...] -g [<key1>[,<key2>],...]

With both key lists being equal.

But "info" is way too vague, perhaps "hist_keys", or something more
compact, like: "\-s", to reuse the semantics of regular expression
backreferences (\1).
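
A rough sketch of the aliasing idea in Python, just to make the semantics
concrete. This is not how perf parses its options; the function name
resolve_callchain_keys and the default alias are hypothetical:

```python
def resolve_callchain_keys(sort_keys, callchain_keys, alias="info"):
    """Expand the alias keyword in the -g key list into the -s key list.

    Hypothetical helper: any other key in callchain_keys is kept as is,
    so the two lists stay independent unless the alias is used.
    """
    resolved = []
    for key in callchain_keys:
        if key == alias:
            resolved.extend(sort_keys)
        else:
            resolved.append(key)
    return resolved

# "-s comm,dso -g info": the callchain lines reuse the hist entry keys.
same_as_sort = resolve_callchain_keys(["comm", "dso"], ["info"])

# "-s comm,dso -g pid": the callchain lines use their own key set.
independent = resolve_callchain_keys(["comm", "dso"], ["pid"])
```

Whatever the keyword ends up being called, the point is that it is only a
shorthand, the general form takes an arbitrary key list.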

> $ perf report -s comm,dso -g folded,count,info -F none
> 809 swapper;[kernel.vmlinux];cpu_bringup_and_idle;cpu_startup_entry;...

> Note that the info part (swapper;[kernel.vmlinux]) is also separated
> by a semicolon. But I think it's ok since it's controlled by command
> line, so a script can know how many entries there will be.

> > But yeah, the value is the semicolon delimited stack all the way to the
> > comm/PID:comm if there is more than one or if the user asks it to be
> > there via a -g keyword, all the other counts/info are just relative to
> > that, CSV or whatever other delimiter the user asks it to, and space is
> > not an option, as we know it can appear in the middle of a COMM:
>
> Yes, I think that we should use a given separator (using -t option)
> instead of hard-coded semicolon. Although it'd be rare, it seems
> possible to use semicolons in the comm name too.

Well, we can have an option to specify what would be the separator for
the callchains.
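
From the script writer's side, a consumer could then look roughly like the
Python sketch below. It assumes the "value key" layout discussed in this
thread (count, one space, then the separator-delimited key), that the
script knows how many info fields it asked for on the command line, and
that the separator never appears inside a field. The function name
parse_folded and the sample lines are illustrative only:

```python
def parse_folded(line, n_info, sep=";"):
    """Split one folded line into (count, info fields, stack frames).

    Only the first space is treated as a delimiter, since comm and
    C++ frames may contain spaces themselves.
    """
    count, key = line.split(" ", 1)
    parts = key.split(sep)
    return int(count), parts[:n_info], parts[n_info:]

# Two info fields (comm, dso) with the default semicolon separator.
sample = "809 swapper;[kernel.vmlinux];cpu_bringup_and_idle;cpu_startup_entry"
count, info, stack = parse_folded(sample, n_info=2)

# One info field with a user-chosen separator and a comm containing a space.
sample2 = "12 java main|JavaCalls::call_helper(JavaValue*, Thread*)|start_thread"
count2, info2, stack2 = parse_folded(sample2, n_info=1, sep="|")
```

With a user-chosen separator the script only has to pick a character that
cannot show up in comm or in the symbol names it expects.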

- Arnaldo