Re: [PATCH] perf intel-pt: Synthesize cycle events

From: Steinar H. Gunderson
Date: Fri Mar 11 2022 - 12:42:38 EST


On Fri, Mar 11, 2022 at 11:10:30AM +0200, Adrian Hunter wrote:
>> @@ -1633,7 +1639,7 @@ static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
>> else
>> sample.period = ptq->state->tot_insn_cnt - ptq->last_insn_cnt;
>>
>> - if (ptq->sample_ipc)
>> + if (ptq->sample_ipc || pt->sample_cycles)
> This is not quite right. ptq->sample_ipc is set to indicate when the
> cycle count is accurate for the current instruction. It can be weakened
> by using "Approx IPC" which was introduced for dlfilter-show-cycles.
> Probably that approach should be followed for a "cycles" event also.

Thanks for the review!

I'm not sure if I understand this entirely. The point of the code here
is to get sample.cyc_cnt computed, even if we're not sampling IPC.
I've seen the approx IPC code, but I'm not entirely sure how it
interacts with this?

>> sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_in_cyc_cnt;
>> if (sample.cyc_cnt) {
>> sample.insn_cnt = ptq->ipc_insn_cnt - ptq->last_in_insn_cnt;
>> @@ -1643,8 +1649,30 @@ static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
>>
>> ptq->last_insn_cnt = ptq->state->tot_insn_cnt;
> There are variables here that are specific to the "instructions" event, so
> mixing "cycles" with "instructions" means duplicating those, however maybe
> it would be better not to allow "y" and "i" options at the same time?

Given that a decode can easily take an hour, it would be really nice to
be able to keep y and i at the same time :-) (A long-standing pet peeve
of mine in perf is that you can't show two events side-by-side;
oprofile did that back in the day, at least on annotations.)

What specifically do you mean by duplicating? That we need to calculate
them twice in a way I don't do in my current patch, or something else?
It feels like I'm missing some context here.

>> - return intel_pt_deliver_synth_event(pt, event, &sample,
>> - pt->instructions_sample_type);
>> + if (pt->sample_instructions) {
>> + err = intel_pt_deliver_synth_event(pt, event, &sample,
>> + pt->instructions_sample_type);
>> + if (err)
>> + return err;
>> + }
>> +
>> + /*
>> + * NOTE: If not doing sampling (e.g. itrace=y0us), we will in practice
>> + * only see cycles being attributed to branches, since CYC packets
>> + * only are emitted together with other packets are emitted.
>> + * We should perhaps consider spreading it out over everything since
>> + * the last CYC packet, ie., since last time sample.cyc_cnt was nonzero.
>> + */
>> + if (pt->sample_cycles && sample.cyc_cnt) {
>> + sample.id = ptq->pt->cycles_id;
>> + sample.stream_id = ptq->pt->cycles_id;
> A "cycles" sample needs to set the sample period to the number of cycles since the
> last "cycles" sample.

OK. If I understand you right, is this as simple as sample.period =
sample.cyc_cnt?

/* Steinar */