Re: Cycles annotation support for perf tools v3
From: Arnaldo Carvalho de Melo
Date: Thu Aug 06 2015 - 22:00:20 EST
On Sat, Jul 18, 2015 at 08:24:45AM -0700, Andi Kleen wrote:
> [v2: Addressed review comments. Fixed display problems and
> correctly compute IPC now. See patches for detailed changes.]
> [v3: Merged with current Arnaldo perf/core and added acked-by.]
>
> [Note that the respective kernel patches to report cycles are in
> peterz's perf/core queue, but so far not in tip. The patchkit can
> still be tested with the "fake cycles" debug patch added at the
> end.]
>
> The upcoming Skylake CPU has a new timed branch stack feature that
> reports cycle counts for individual branches in the last branch
> record.
>
> This provides fine-grained cost information for code and also makes
> it possible to compute fine-grained IPC.
Thanks, applied.
- Arnaldo
> Available from
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools3
>
> This patchkit adds support for this in the perf tools:
> - Basic support for the cycles field like other branch fields
> - Show cycles in the standard branch sort view (no IPC here,
> as IPC needs the instruction counts from annotation)
> - Annotate cycles and IPC in the assembler annotate view
> - Add branch support to top, so we can do live annotation.
> - Misc support, like dumping it in perf report -D
>
> Example output for annotate (with made up numbers):
>
> The second column is the IPC and the third is the average cycles for the basic block.
>
>                 │    static int hex(char ch)
>                 │    {
>       0.12      │      push   %rbp
>       0.12      │      mov    %rsp,%rbp
>       0.12      │      sub    $0x20,%rsp
>       0.12      │      mov    %edi,%eax
>       0.12      │      mov    %al,-0x14(%rbp)
>       0.12      │      mov    %fs:0x28,%rax
>       0.12      │      mov    %rax,-0x8(%rbp)
>       0.12      │      xor    %eax,%eax
>                 │    if ((ch >= '0') && (ch <= '9'))
>       0.12      │      cmpb   $0x2f,-0x14(%rbp)
> 66.67 0.12 123  │    ↓ jle    31
>       0.12      │      cmpb   $0x39,-0x14(%rbp)
>       0.12 123  │    ↓ jg     31
>                 │    return ch - '0';
> 22.22 0.12      │      movsbl -0x14(%rbp),%eax
>       0.12      │      sub    $0x30,%eax
>       0.12 123  │    ↓ jmp    60
>                 │    if ((ch >= 'a') && (ch <= 'f'))
>       0.06      │31:   cmpb   $0x60,-0x14(%rbp)
>       0.06 123  │    ↓ jle    46
>       0.06      │      cmpb   $0x66,-0x14(%rbp)
>       0.06      │    ↓ jg     46
>                 │    return ch - 'a' + 10;
>       0.06      │      movsbl -0x14(%rbp),%eax
>
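As a rough illustration of how the IPC and average-cycles columns relate, here is a minimal Python sketch. It assumes that a block's IPC is its instruction count divided by the average cycle count sampled for that block. The BlockStats class and its field names are hypothetical and are not taken from the patches.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Accumulated samples for one basic block (hypothetical structure)."""
    n_insn: int            # number of instructions in the basic block
    total_cycles: int = 0  # sum of cycle counts over all samples
    samples: int = 0       # number of branch records that ended this block

    def add_sample(self, cycles: int) -> None:
        # Each timed branch record contributes one cycle count.
        self.total_cycles += cycles
        self.samples += 1

    def avg_cycles(self) -> float:
        # Average cycles per execution of the block; 0.0 if never sampled.
        return self.total_cycles / self.samples if self.samples else 0.0

    def ipc(self) -> float:
        # Assumed definition: instructions per (average) cycle for the block.
        avg = self.avg_cycles()
        return self.n_insn / avg if avg else 0.0


# Made-up numbers: a 10-instruction block sampled three times.
block = BlockStats(n_insn=10)
for cycles in (100, 140, 120):
    block.add_sample(cycles)

print(f"avg cycles {block.avg_cycles():.0f}, IPC {block.ipc():.2f}")
```

Every instruction in the same basic block would then show the same IPC value, which is consistent with the repeated values in the example output above.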
> Example output for branch view (again with fake data):
>
> Overhead  Command  Source Shared Object  Source Symbol          Target Symbol          Basic Block Cycles
>   30.08%  tcall    tcall                 [.] f1                 [.] f2                 123
>   27.44%  tcall    tcall                 [.] f2                 [.] f1                 123
>   15.60%  tcall    tcall                 [.] main               [.] f1                 123
>   12.96%  tcall    tcall                 [.] f1                 [.] main               123
>   12.86%  tcall    tcall                 [.] main               [.] main               123
>    0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt  [k] hrtimer_interrupt  123
>
> IPC computation has a few limitations (see the comments in the respective patches),
> in particular it punts on overlapping basic blocks.
>
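To make the overlapping basic block problem concrete: two branch records can end at the same branch but start at different addresses, so their cycle counts cover different instruction ranges and cannot simply be averaged together. The Python sketch below shows one possible way to sidestep this, by keeping statistics only for the longest block seen per end address. This policy is an assumption chosen for illustration, and the account_block helper and its table layout are hypothetical; see the patches for what the tools actually do.

```python
def account_block(table, start, end, cycles):
    """Record one (start, end, cycles) sample, keyed by the block's end address.

    Hypothetical policy: keep statistics only for the longest block
    (lowest start address) seen for each end address.
    """
    entry = table.get(end)
    if entry is None or start < entry["start"]:
        # First block for this branch, or a longer one: restart the stats.
        table[end] = {"start": start, "cycles": cycles, "num": 1}
    elif start == entry["start"]:
        # Same block as before: accumulate.
        entry["cycles"] += cycles
        entry["num"] += 1
    # Otherwise the sample belongs to a shorter overlapping block and is dropped.
    return table


table = {}
account_block(table, 0x10, 0x30, 100)
account_block(table, 0x20, 0x30, 50)   # shorter overlapping block: dropped
account_block(table, 0x10, 0x30, 120)  # same block: accumulated
```

Dropping the shorter block loses some samples, which is one reason the resulting IPC should be read as an approximation.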
> The cycle annotation only works in the interactive annotate view. It currently
> does not work in the scripted perf annotate, as that is missing a lot of the
> infrastructure needed for per-instruction state.
>
> It would be nice to add column headers to annotate.
>
> So far there is no support in --branch-history or in perf script.