[PATCH v5 7/9] perf cs-etm: Fixup exception entry for thread stack
From: Leo Yan
Date: Thu Feb 20 2020 - 00:29:01 EST
In theory, when an exception is taken, the expected return address
(ret_addr = from_ip + insn_len) is pushed onto the thread stack. Later,
when the exception returns, the return address (the new packet's to_ip)
is compared with the ret_addr on the thread stack; if the two values
match, the thread stack is popped.
When a branch instruction's target address triggers an exception, the
ret_addr pushed at exception entry is the branch target address plus the
instruction length. But the branch instruction is not taken, so the
exception return address is the branch target address itself. The thread
stack's ret_addr therefore cannot match the exception return address,
and the thread stack is not popped properly.
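The mismatch described above can be replayed with a toy model. This is
only a sketch: toy_stack, toy_push() and toy_pop() are hypothetical names
and not perf's thread-stack code, the stack holds a single entry, and a
4-byte A64 instruction length is assumed. The address is memcpy+0x0 from
the perf script output in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy one-entry model of a thread stack; not the perf implementation. */
struct toy_stack {
	uint64_t ret_addr;	/* expected return address */
	bool	 used;		/* true while an entry is pushed */
};

/* On exception entry, record where the flow is expected to return. */
static void toy_push(struct toy_stack *ts, uint64_t from_ip, int insn_len)
{
	ts->ret_addr = from_ip + (uint64_t)insn_len;
	ts->used = true;
}

/* On return, pop only if the packet's to_ip equals the recorded ret_addr. */
static bool toy_pop(struct toy_stack *ts, uint64_t to_ip)
{
	if (!ts->used || ts->ret_addr != to_ip)
		return false;
	ts->used = false;
	return true;
}

/* Replay the failing case: push target + 4, then return to target. */
static void toy_replay(void)
{
	const uint64_t target = 0xffffad816a10ULL;	/* memcpy+0x0 */
	struct toy_stack ts = { 0, false };

	toy_push(&ts, target, 4);		/* assumed 4-byte instruction */
	assert(ts.ret_addr == 0xffffad816a14ULL);	/* memcpy+0x4 */
	assert(!toy_pop(&ts, target));		/* no match: entry stays */
}
```

Calling toy_replay() leaves the entry on the toy stack, which corresponds
to the stale frame that later shows up in the callchain.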
This patch fixes up the ret_addr at exception entry: when it detects
that the exception is triggered by a branch target address, it sets
'insn_len' to zero. This allows the thread stack to be popped properly
on return from the exception.
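As a rough standalone illustration of this check (not the code added to
cs-etm.c), the sketch below uses a hypothetical demo_packet struct and
stand-in DEMO_FLAG_* bits in place of perf's packet type and
PERF_IP_FLAG_* definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag bits for this sketch only; perf defines its own flags. */
#define DEMO_FLAG_BRANCH	(1u << 0)
#define DEMO_FLAG_CALL		(1u << 1)
#define DEMO_FLAG_INTERRUPT	(1u << 6)

/* Hypothetical packet: only the fields the check looks at. */
struct demo_packet {
	uint32_t flags;
	uint64_t start_addr;
	uint64_t end_addr;
};

/*
 * Return 0 when the packet is an exception entry whose start and end
 * addresses are equal; otherwise return insn_len unchanged.
 */
static int demo_fixup_insn_len(const struct demo_packet *p, int insn_len)
{
	const uint32_t exc_entry = DEMO_FLAG_BRANCH | DEMO_FLAG_CALL |
				   DEMO_FLAG_INTERRUPT;

	if (p->flags == exc_entry && p->start_addr == p->end_addr)
		return 0;
	return insn_len;
}

/* Small self-check covering the fixed-up case and one untouched case. */
static void demo_check(void)
{
	struct demo_packet exc = { DEMO_FLAG_BRANCH | DEMO_FLAG_CALL |
				   DEMO_FLAG_INTERRUPT, 0x1000, 0x1000 };
	struct demo_packet call = { DEMO_FLAG_BRANCH | DEMO_FLAG_CALL,
				    0x1000, 0x1000 };

	assert(demo_fixup_insn_len(&exc, 4) == 0);
	assert(demo_fixup_insn_len(&call, 4) == 4);
}
```

The flags are compared for equality rather than with a bit test, as in
the patch, so packets carrying any additional flag keep their
instruction length.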
Before:
# perf script --itrace=g16l64i100
main 3258 100 instructions:
ffff800010082c1c el0_sync+0x5c ([kernel.kallsyms])
ffffad816a14 memcpy+0x4 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800820 _dl_start_final+0x48 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800b00 _dl_start+0x200 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800048 _start+0x8 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800044 _start+0x4 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
The issues in the output:
memcpy+0x4 => The call to memcpy() causes an exception; its return
address should be memcpy+0x0.
_start+0x4 => The thread stack is not popped correctly; this is stale
data left over from the previous exception flow.
After:
# perf script --itrace=g16l64i100
main 3258 100 instructions:
ffff800010082c1c el0_sync+0x5c ([kernel.kallsyms])
ffffad816a10 memcpy+0x0 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800820 _dl_start_final+0x48 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800b00 _dl_start+0x200 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
ffffad800048 _start+0x8 (/usr/lib/aarch64-linux-gnu/ld-2.28.so)
Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx>
---
tools/perf/util/cs-etm.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index d9c22c145307..4800daf0dc3d 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1160,6 +1160,7 @@ static void cs_etm__add_stack_event(struct cs_etm_queue *etmq,
u8 trace_chan_id = tidq->trace_chan_id;
int insn_len;
u64 from_ip, to_ip;
+ u32 flags;
if (etm->synth_opts.callchain || etm->synth_opts.thread_stack) {
from_ip = cs_etm__last_executed_instr(tidq->prev_packet);
@@ -1168,6 +1169,27 @@ static void cs_etm__add_stack_event(struct cs_etm_queue *etmq,
insn_len = cs_etm__instr_size(etmq, trace_chan_id,
tidq->prev_packet->isa, from_ip);
+ /*
+ * Fix up the exception entry.
+ *
+ * If the packet's start_addr is the same as its end_addr, this
+ * packet was altered from an exception packet to a range packet;
+ * the details are described in cs_etm__exception(), which
+ * handles the case where a branch instruction is not taken but
+ * the branch triggers an exception.
+ *
+ * In this case, fix up 'insn_len' to zero so that the thread
+ * stack's return address matches the exception return address,
+ * and the thread stack can be popped properly on return from
+ * the exception.
+ */
+ flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL |
+ PERF_IP_FLAG_INTERRUPT;
+ if (tidq->prev_packet->flags == flags &&
+ tidq->prev_packet->start_addr ==
+ tidq->prev_packet->end_addr)
+ insn_len = 0;
+
/*
* Create thread stacks by keeping track of calls and returns;
* any call pushes thread stack, return pops the stack, and
--
2.17.1