Re: [PATCH V6 08/17] perf tools: Add Intel PT support

From: Adrian Hunter
Date: Fri Jun 26 2015 - 02:51:09 EST


On 26/06/15 03:09, Arnaldo Carvalho de Melo wrote:
> On Thu, Jun 25, 2015 at 08:56:34PM -0300, Arnaldo Carvalho de Melo wrote:
>> Will do the same tests with intel_pt as well, on a remote machine, add examples
>> to the changeset logs and everything going well, aim for pushing for Ingo soon,
>
> So, I asked for callchains, with:
>
> perf record -g -e intel_bts// ls
>
> And it got stuck somewhere, then I did a perf top to see where it was,
> and got to:
>
> 96.24% perf [.] intel_bts_process_queue
>
> Annotating I get to:
>
>   1.17 │1a0:┌─→mov    0x8(%r13),%rdx
>        │    │  test   %rdx,%rdx
>  98.83 │    └──je     1a0
>
>
> Which is an endless loop! Source code for intel_bts_process_buffer(),
> inlined there:
>
> while (sz > sizeof(struct branch)) {
> if (!branch->from && !branch->to)
> continue;
> err = intel_bts_synth_branch_sample(btsq, branch);
> if (err)
> break;
> branch += 1;
> sz -= sizeof(struct branch);
> }
>
> Can you fix this, please, so that I can fold it into where it was
> introduced, namely:
>
> commit 439ad895a2aecea09416206f023336297cc72efe
> Author: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Date: Fri May 29 16:33:39 2015 +0300
>
> perf tools: Add Intel BTS support

It is fixed as an unexpected side-effect of a following patch (which is probably why I didn't notice it - or perhaps I rolled the fix into the wrong patch O_o). The fix is in:

perf tools: Output sample flags and insn_len from intel_bts

intel_bts synthesizes samples. Fill in the new flags and insn_len
members with instruction information.

Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>


So what you want is:


diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 48bcbd607ef7..68bb6fede55b 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -304,7 +304,7 @@ static int intel_bts_process_buffer(struct intel_bts_queue *btsq,
struct auxtrace_buffer *buffer)
{
struct branch *branch;
- size_t sz;
+ size_t sz, bsz = sizeof(struct branch);
int err = 0;

if (buffer->use_data) {
@@ -318,14 +318,12 @@ static int intel_bts_process_buffer(struct intel_bts_queue *btsq,
if (!btsq->bts->sample_branches)
return 0;

- while (sz > sizeof(struct branch)) {
+ for (; sz > bsz; branch += 1, sz -= bsz) {
if (!branch->from && !branch->to)
continue;
err = intel_bts_synth_branch_sample(btsq, branch);
if (err)
break;
- branch += 1;
- sz -= sizeof(struct branch);
}
return err;
}


But obviously that will conflict with "perf tools: Output sample flags and insn_len from intel_bts".
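
For reference, here is a minimal standalone sketch of why the for-loop form cannot spin: the advance step sits in the loop header, so 'continue' still moves on to the next record. The struct layout and the helper name count_valid_branches() are assumptions made up for illustration, not the perf code, and the sketch counts records instead of synthesizing samples. It also compares with '>=' so that the last record is visited, whereas the patch above keeps the original '>' comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a BTS record; only 'from' and 'to' matter here. */
struct branch {
	uint64_t from;
	uint64_t to;
	uint64_t misc;
};

/*
 * Count the non-empty records in a buffer of 'sz' bytes.
 *
 * In the while-loop version, 'continue' jumped back to the condition
 * without advancing 'branch' or shrinking 'sz', so one empty record
 * was re-tested forever. Here both updates are in the for-header and
 * run on every iteration, including the ones ended by 'continue'.
 */
static size_t count_valid_branches(const struct branch *branch, size_t sz)
{
	const size_t bsz = sizeof(struct branch);
	size_t n = 0;

	for (; sz >= bsz; branch += 1, sz -= bsz) {
		if (!branch->from && !branch->to)
			continue;	/* empty record: skip it, header still advances */
		n++;
	}
	return n;
}
```

Calling it on a buffer that contains an all-zero record returns promptly with the number of remaining records, which is the case that hung in the 'perf record -g -e intel_bts// ls' run.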

Another thing, the intel_bts implementation does not support
"instructions" samples because there is no timing information to
use to create periodic samples. But callchains are added only
to "instructions" samples so there are no callchains in 'perf report'
for intel_bts. The call information is still available for
db-export and the example call-graph, though.
