[RFC PATCH] perf session: Fixup timestamp for ordered events
From: Leo Yan
Date: Wed Nov 18 2020 - 05:55:49 EST
Perf tool relies on MMAP/MMAP2 events to prepare DSO maps, it allocates
DSO maps for MMAP/MMAP2 events which is used for parsing symbol. Thus,
during the recording, perf tool implictly expects the MMAP/MMAP2 events
should arrive before AUX event, in other words, MMAP/MMAP2's timestamp
should less than AUX event's timestamp, and the MMAP/MMAP2 events will
be added into the front of ordered event list; this can allow the DSO
maps to be prepared and can be consumed when process AUX event.
See the function perf_evlist__prepare_workload(), though it uses pipes
to handshake before perf process (the parent process) and forked process
for the profiled program, it cannot promise the timing between
MMAP/MMAP2 events and AUX event.
On Arm Juno board, the AUX event can be observed to arrive ahead than
MMAP2 event, with the command:
perf record -e cs_etm/@tmc_etr0/ -S -- dd if=/dev/zero of=/dev/null
The follow diagram depicts the flow for how the AUX event is arriving
ahead than MMAP2 event:
T1: T3: T4:
perf process Open PMU device Perf is scheduled out;
invoke perf_aux_output_end()
and generate AUX event
^ ^ ^
| | |
CPU0 ---------------------------------------------------> (T)
\
\ Forked process is placed on another CPU
V
CPU1 ---------------------------------------------------> (T)
| |
V V
T2: T5:
Invoke execvp() for profiled Record MMAP2 event
program
In this scenario, the perf main process runs on CPU0 and the profiled
program (which is forked child process) runs on CPU1. The main process
opens PMU device for AUX trace (T3) and it will generate AUX event until
the perf process is scheduled out (T4); the profiled program will be
launched by execvp() (T2) and later will record MMAP event for memory
mapping (T5).
Usually, the AUX event will be later than MMAP2 event, but on the Arm
Juno platform, it has chance that AUX event occurs prior to MMAP2 event
with two reasons:
- Arm Juno platform is big.LITTLE architecture, so CPU0 is big CPU and
CPU1 is LITTLE CPU, the performance between big CPU and LITTLE CPU is
significant, this gives chance for the perf main process to run much
faster than the profiled program;
- In the kernel, the RT thread (like kernel's CPUFreq thread) has chance
to preempt perf main thread, so when the perf main thread is
switched out, the AUX event will be generated and it might be early
than profiled program's MMAP2 event.
To fix this issue, this patch records the first AUX event's timestamp
into 'aux_timestamp', if find any MMAP/MMAP2 event is late coming, it
fixes up the MMAP/MMAP2 events' timestamp as 'aux_timestamp-1', so the
MMAP/MMAP2 event will be inserted into ordered list ahead than AUX event
and also will be handled before AUX event.
Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx>
---
tools/perf/util/session.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 098080287c68..1aa54941bf81 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1753,11 +1753,36 @@ static s64 perf_session__process_event(struct perf_session *session,
if (tool->ordered_events) {
u64 timestamp = -1ULL;
+ static u64 aux_timestamp = -1ULL;
ret = perf_evlist__parse_sample_timestamp(evlist, event, ×tamp);
if (ret && ret != -1)
return ret;
+ /*
+ * Cache the first AUX event's timestamp into 'aux_timestamp',
+ * which is used to fixup MMAP/MMAP2's timestamp.
+ */
+ if ((event->header.type == PERF_RECORD_AUX) &&
+ (aux_timestamp == -1ULL))
+ aux_timestamp = timestamp;
+
+ /*
+ * If the AUX event arrives prior to MMAP/MMAP2 events, it's
+ * possible to have no chance to create DSOs when decode AUX
+ * trace data, thus the symbol cannot be parsed properly.
+ * To allow the DSOs are prepared before process AUX event,
+ * fixup the MMAP/MMAP2 events' timestamp to be prior to any
+ * AUX event's timestamp, so MMAP/MMAP2 events will be
+ * handled ahead and the DSO map will be prepared before AUX
+ * event handling.
+ */
+ if (event->header.type == PERF_RECORD_MMAP2 ||
+ event->header.type == PERF_RECORD_MMAP) {
+ if (timestamp > aux_timestamp)
+ timestamp = aux_timestamp - 1;
+ }
+
ret = perf_session__queue_event(session, event, timestamp, file_offset);
if (ret != -ETIME)
return ret;
--
2.17.1