[PATCH] LoongArch: Fix callchain parse error with kernel tracepoint events
From: Huacai Chen
Date: Tue Apr 23 2024 - 03:43:41 EST
In order to fix perf's callchain parse error for LoongArch, we implement
perf_arch_fetch_caller_regs() which fills several necessary registers
used for callchain unwinding, including sp, fp, and era. This is similar
to the following commits.
commit b3eac0265bf6:
("arm: perf: Fix callchain parse error with kernel tracepoint events")
commit 5b09a094f2fb:
("arm64: perf: Fix callchain parse error with kernel tracepoint events")
commit 9a7e8ec0d4cc:
("riscv: perf: Fix callchain parse error with kernel tracepoint events")
Test with commands:
perf record -e sched:sched_switch -g --call-graph dwarf
perf report
Without this patch:
Children Self Command Shared Object Symbol
........ ........ ............. ................. ....................
43.41% 43.41% swapper [unknown] [k] 0000000000000000
10.94% 10.94% loong-container [unknown] [k] 0000000000000000
|
|--5.98%--0x12006ba38
|
|--2.56%--0x12006bb84
|
--2.40%--0x12006b6b8
With this patch, callchain can be parsed correctly:
Children Self Command Shared Object Symbol
........ ........ ............. ................. ....................
47.57% 47.57% swapper [kernel.vmlinux] [k] __schedule
|
---__schedule
26.76% 26.76% loong-container [kernel.vmlinux] [k] __schedule
|
|--13.78%--0x12006ba38
| |
| |--9.19%--__schedule
| |
| --4.59%--handle_syscall
| do_syscall
| sys_futex
| do_futex
| futex_wait
| futex_wait_queue_me
| hrtimer_start_range_ns
| __schedule
|
|--8.38%--0x12006bb84
| handle_syscall
| do_syscall
| sys_epoll_pwait
| do_epoll_wait
| schedule_hrtimeout_range_clock
| hrtimer_start_range_ns
| __schedule
|
--4.59%--0x12006b6b8
handle_syscall
do_syscall
sys_nanosleep
hrtimer_nanosleep
do_nanosleep
hrtimer_start_range_ns
__schedule
Reported-by: Youling Tang <tangyouling@xxxxxxxxxx>
Suggested-by: Youling Tang <tangyouling@xxxxxxxxxx>
Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx>
---
arch/loongarch/include/asm/perf_event.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/loongarch/include/asm/perf_event.h b/arch/loongarch/include/asm/perf_event.h
index 2a35a0bc2aaa..157c4ace69d0 100644
--- a/arch/loongarch/include/asm/perf_event.h
+++ b/arch/loongarch/include/asm/perf_event.h
@@ -9,4 +9,10 @@
#define perf_arch_bpf_user_pt_regs(regs) (struct user_pt_regs *)regs
+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+ (regs)->csr_era = (__ip); \
+ (regs)->regs[3] = current_stack_pointer; \
+ (regs)->regs[22] = (unsigned long) __builtin_frame_address(0); \
+}
+
#endif /* __LOONGARCH_PERF_EVENT_H__ */
--
2.43.0