Re: [PATCH v2 1/2] perf cs-etm: Fix decoding for sparse CPU maps

From: James Clark

Date: Mon Jan 19 2026 - 10:16:05 EST

On 19/01/2026 2:53 pm, Leo Yan wrote:

On Mon, Jan 19, 2026 at 11:55:53AM +0000, James Clark wrote:

[...]

0 0 0x200 [0x90]: PERF_RECORD_ID_INDEX nr: 4
... id: 771 idx: 0 cpu: 2 tid: -1
... id: 772 idx: 1 cpu: 3 tid: -1
... id: 773 idx: 0 cpu: 2 tid: -1
... id: 774 idx: 1 cpu: 3 tid: -1

Seems to me that this patch works around the issue by using the CPU ID
instead, but event->auxtrace.idx is broken.

I don't think it's a workaround, I think it should have been written this
way in the first place.

Agree. Sorry for confusion in my rely.

[...]

@@ -3086,7 +3086,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
if (aux_offset >= auxtrace_event->offset &&
aux_offset + aux_size <= auxtrace_event->offset + auxtrace_event->size) {
- struct cs_etm_queue *etmq = etm->queues.queue_array[auxtrace_event->idx].priv;
+ struct cs_etm_queue *etmq = cs_etm__get_queue(etm, auxtrace_event->cpu);

During decoding, cs_etm allocates its own AUX queues and needs to
convert the AUX records into a cs_etm-specific format.

For example:

event_id=405 idx=0 cpu=0 tid=6673
event_id=406 idx=1 cpu=1 tid=6673
event_id=407 idx=2 cpu=3 tid=6673
event_id=408 idx=3 cpu=4 tid=6673
event_id=409 idx=4 cpu=5 tid=6673
` ` `> CPU ID
`> Event ID `> Generic buffer IDs

In this case, I deliberately hotplugged off CPU2. As a result, the
buffer IDs are consecutive from 0 to 4, while the CPU ID for CPU2 is
missing. When adding AUX records to the cs_etm queues, we need to
convert from the generic buffer index to the corresponding CPU ID.

So the above change makes sense to me.

/*
* If this AUX event was inside this buffer somewhere, create a new auxtrace event
@@ -3095,6 +3095,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
auxtrace_fragment.auxtrace = *auxtrace_event;
auxtrace_fragment.auxtrace.size = aux_size;
auxtrace_fragment.auxtrace.offset = aux_offset;
+ auxtrace_fragment.auxtrace.idx = etmq->queue_nr;

I am still confused about this. Because above auxtrace will be queued
on cs_etm queues, should we convert from the generic buffer index to the
cs_etm specific one? E.g.,

auxtrace_fragment.auxtrace.idx = auxtrace_event->cpu;

This is basically what the change is already doing. etmq->queue_nr == CPU because cs_etm__setup_queue() is called for every CPU up to max_cpu, not only for ones that were recorded so it has a 1:1 mapping. Using etmq->queue_nr also works in per-thread mode where queue 0 is used, which your suggestion doesn't handle. Otherwise they would be equivalent.

To be honest the whole decoder has become hacks on top of hacks. I think we might want to do a full from scratch re-write when we come to do the proper timestamp to MMAP event correlation or the change to not keep resetting the decoder for TRBE wrap mode.

BTW, if my understanding above is valid, it is good to go through the
cs_etm.c file for the "idx <-> CPU ID" conversion.

Do you mean there are other mistakes? I did check the rest and did some testing and didn't see any issues.

Thanks,
Leo

file_offset += aux_offset - auxtrace_event->offset + auxtrace_event->header.size;
pr_debug3("CS ETM: Queue buffer size: %#"PRI_lx64" offset: %#"PRI_lx64

--
2.34.1

_______________________________________________
CoreSight mailing list -- coresight@xxxxxxxxxxxxxxxx
To unsubscribe send an email to coresight-leave@xxxxxxxxxxxxxxxx