Re: [PATCH V4 1/3] riscv: Add perf callchain support

From: Mao Han
Date: Wed Aug 21 2019 - 06:57:11 EST


Hi Greentime,
On Wed, Aug 21, 2019 at 05:16:13PM +0800, Greentime Hu wrote:
> Hi Mao,
>
> Mao Han <han_mao@xxxxxxxxx> wrote on Tue, Aug 20, 2019 at 4:57 PM:
> >
> > This patch adds support for perf callchain sampling on the riscv platform.
> > The return address of a leaf function is retrieved from pt_regs, as
> > it is not saved in the outermost frame.
> >
> >
>
> Not sure if I did something wrong. I encounter a build error when I
> try to build tools/perf/tests
>
> CC arch/riscv/util/dwarf-regs.o
> arch/riscv/util/dwarf-regs.c:64:5: error: no previous prototype for
> 'regs_query_register_offset' [-Werror=missing-prototypes]
>

This seems to be because I didn't add PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
to tools/perf/arch/riscv/Makefile, so the prototype in
./util/include/dwarf-regs.h is not declared. I'll add that in the next
version.
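
Roughly, the fix should be a single line like the one below. This is only a
sketch, not the final patch: it assumes the riscv Makefile follows the same
pattern as the other arch Makefiles under tools/perf/arch/, and that the perf
build turns this variable into the HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET define
that guards the prototype in dwarf-regs.h.

```make
# tools/perf/arch/riscv/Makefile (sketch; placement in the file is an assumption)
# Intended to make the regs_query_register_offset() prototype visible,
# which should silence the -Werror=missing-prototypes failure.
PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
```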

> I simply added its prototype and then it built successfully.
> This is my testing results.
> # ./perf test
> 1: vmlinux symtab matches kallsyms : Skip
> 2: Detect openat syscall event : FAILED!
> 3: Detect openat syscall event on all cpus : FAILED!
> 4: Read samples using the mmap interface : FAILED!
> 5: Test data source output : Ok
> 6: Parse event definition strings : FAILED!
> 7: Simple expression parser : Ok
> 8: PERF_RECORD_* events & perf_sample fields : FAILED!
> 9: Parse perf pmu format : Ok
> 10: DSO data read : Ok
> 11: DSO data cache : Ok
> 12: DSO data reopen : Ok
> 13: Roundtrip evsel->name : Ok
> 14: Parse sched tracepoints fields : FAILED!
> 15: syscalls:sys_enter_openat event fields : FAILED!
> 16: Setup struct perf_event_attr : FAILED!
> 17: Match and link multiple hists : Ok
> 18: 'import perf' in python : FAILED!
>
> 19: Breakpoint overflow signal handler : FAILED!
> 20: Breakpoint overflow sampling : FAILED!
> 21: Breakpoint accounting : Skip
> 22: Watchpoint :
> 22.1: Read Only Watchpoint : FAILED!
> 22.2: Write Only Watchpoint : FAILED!
> 22.3: Read / Write Watchpoint : FAILED!
> 22.4: Modify Watchpoint : FAILED!
> 23: Number of exit events of a simple workload : Ok
> 24: Software clock events period values : Ok
> 25: Object code reading : Ok
> 26: Sample parsing : Ok
> 27: Use a dummy software event to keep tracking : Ok
> 28: Parse with no sample_id_all bit set : Ok
> 29: Filter hist entries : Ok
> 30: Lookup mmap thread : Ok
> 31: Share thread mg : Ok
> 32: Sort output of hist entries : Ok
> 33: Cumulate child hist entries : Ok
> 34: Track with sched_switch : Ok
> 35: Filter fds with revents mask in a fdarray : Ok
> 36: Add fd to a fdarray, making it autogrow : Ok
> 37: kmod_path__parse : Ok
> 38: Thread map : Ok
> 39: LLVM search and compile :
> 39.1: Basic BPF llvm compile : Skip
> 39.2: kbuild searching : Skip
> 39.3: Compile source for BPF prologue generation : Skip
> 39.4: Compile source for BPF relocation : Skip
> 40: Session topology : FAILED!
> 41: BPF filter :
> 41.1: Basic BPF filtering : Skip
> 41.2: BPF pinning : Skip
> 41.3: BPF relocation checker : Skip
> 42: Synthesize thread map : Ok
> 43: Remove thread map : Ok
> 44: Synthesize cpu map : Ok
> 45: Synthesize stat config : Ok
> 46: Synthesize stat : Ok
> 47: Synthesize stat round : Ok
> 48: Synthesize attr update : Ok
> 49: Event times : Ok
> 50: Read backward ring buffer : Skip
> 51: Print cpu map : Ok
> 52: Probe SDT events : Skip
> 53: is_printable_array : Ok
> 54: Print bitmap : Ok
> 55: perf hooks : Ok
> 56: builtin clang support : Skip (not compiled in)
> 57: unit_number__scnprintf : Ok
> 58: mem2node : Ok
> 59: time utils : Ok
> 60: map_groups__merge_in : Ok
> 61: probe libc's inet_pton & backtrace it with ping : FAILED!
> 62: Add vfs_getname probe to get syscall args filenames : FAILED!
> 63: Check open filename arg using perf trace + vfs_getname: Skip
> 64: Use vfs_getname probe to get syscall args filenames : FAILED!
> 65: Zstd perf.data compression/decompression : Skip
>

The perf test result I got is quite similar to yours, but with 5
fewer test cases.

1: vmlinux symtab matches kallsyms : Skip
2: Detect openat syscall event : FAILED!
3: Detect openat syscall event on all cpus : FAILED!
4: Read samples using the mmap interface : FAILED!
5: Test data source output : Ok
6: Parse event definition strings : FAILED!
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : FAILED!
9: Parse perf pmu format : Ok
10: DSO data read : Ok
11: DSO data cache : Ok
12: DSO data reopen : Ok
13: Roundtrip evsel->name : Ok
14: Parse sched tracepoints fields : FAILED!
15: syscalls:sys_enter_openat event fields : FAILED!
16: Setup struct perf_event_attr : Skip
17: Match and link multiple hists : Ok
18: 'import perf' in python : Ok
19: Breakpoint overflow signal handler : FAILED!
20: Breakpoint overflow sampling : FAILED!
21: Breakpoint accounting : Skip
22: Watchpoint :
22.1: Read Only Watchpoint : FAILED!
22.2: Write Only Watchpoint : FAILED!
22.3: Read / Write Watchpoint : FAILED!
22.4: Modify Watchpoint : FAILED!
23: Number of exit events of a simple workload : Ok
24: Software clock events period values : Ok
25: Object code reading : Ok
26: Sample parsing : Ok
27: Use a dummy software event to keep tracking: Ok
28: Parse with no sample_id_all bit set : Ok
29: Filter hist entries : Ok
30: Lookup mmap thread : Ok
31: Share thread mg : Ok
32: Sort output of hist entries : Ok
33: Cumulate child hist entries : Ok
34: Track with sched_switch : Ok
35: Filter fds with revents mask in a fdarray : Ok
36: Add fd to a fdarray, making it autogrow : Ok
37: kmod_path__parse : Ok
38: Thread map : Ok
39: LLVM search and compile :
39.1: Basic BPF llvm compile : Skip
39.2: kbuild searching : Skip
39.3: Compile source for BPF prologue generation: Skip
39.4: Compile source for BPF relocation : Skip
40: Session topology : FAILED!
41: BPF filter :
41.1: Basic BPF filtering : Skip
41.2: BPF pinning : Skip
41.3: BPF relocation checker : Skip
42: Synthesize thread map : Ok
43: Remove thread map : Ok
44: Synthesize cpu map : Ok
45: Synthesize stat config : Ok
46: Synthesize stat : Ok
47: Synthesize stat round : Ok
48: Synthesize attr update : Ok
49: Event times : Ok
50: Read backward ring buffer : Skip
51: Print cpu map : Ok
52: Probe SDT events : Skip
53: is_printable_array : Ok
54: Print bitmap : Ok
55: perf hooks : Ok
56: builtin clang support : Skip (not compiled in)
57: unit_number__scnprintf : Ok
58: mem2node : Ok
59: time utils : Ok
60: map_groups__merge_in : Ok

The comparison before/after applying this patch set:

/tools/perf/util# diff perf_test_before perf_test_after
1d0
< # perf test
8c7
< 7: Simple expression parser : FAILED!
---
> 7: Simple expression parser : Ok