Re: [PATCH] perf/kvm: Guest Symbol Resolution for powerpc

From: Arnaldo Carvalho de Melo
Date: Wed Jan 13 2016 - 11:59:50 EST


On Tue, Dec 29, 2015 at 03:38:40PM +0530, Ravi Bangoria wrote:
> 'perf kvm {record|report}' is used to record and report the profiled
> performance of any workload on a guest. From the host, we can collect
> guest kernel statistics, which are useful for finding contention
> in guest kernel symbols for a certain workload.
> This feature is not available on powerpc because 'perf' relies on the
> 'cycles' event (a PMU event) to profile the guest. However, for powerpc,
> this can't be used from the host because the PMUs are controlled by the
> guest rather than the host.

Without getting into whether this approach is the right one, which I
leave to the PowerPC experts, Ingo, PeterZ, etc.:

In cases like this, please break the patch into a series. For instance,
adding that extra evsel parameter to the functions that will ultimately
use it to extract those event fields should be a separate patch, so that
when reviewing the "meat" of your patch we can quickly see what it does,
without having to dig it out from the legwork.

Two other patches should introduce arch__get_{ip,cpumode}().
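
Roughly, and only as a sketch with stand-in types (struct sample_sketch
and arch_get_ip_sketch() are made-up names, not the real perf structs or
functions), the generic side of such a patch is a weak default that an
arch object file can override with a strong definition:

```c
#include <stdint.h>

/* Stand-in for perf's struct perf_sample; only the field used here. */
struct sample_sketch {
	uint64_t ip;
};

/*
 * Generic default: keep the IP that was recorded in the sample.
 * An architecture that needs something else provides a non-weak
 * definition with the same name in its own object file, and the
 * linker picks that one instead of this fallback.
 */
__attribute__((weak))
uint64_t arch_get_ip_sketch(const struct sample_sketch *sample)
{
	return sample->ip;
}
```

With no arch override linked in, the call just hands back sample->ip,
which is what every architecture other than powerpc would keep getting.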

- Arnaldo

> Due to this issue, we need a different approach to profile the
> workload in the guest. On powerpc there is a tracepoint,
> 'kvm_hv:kvm_guest_exit', which is hit whenever any of the threads
> exits the guest context. The guest instruction pointer dumped with
> this tracepoint in the 'pc' field can be used as the guest IP while
> postprocessing the trace data, to map it to a symbol from
> guest.kallsyms.
>
> However, to get some kind of periodicity, we can't use every kvm
> exit, only exits that are bound to happen at regular intervals.
> The HV_DECREMENTER interrupt forces the threads to exit after an
> interval of 10 ms.
>
> This patch makes use of the 'kvm_guest_exit' tracepoint and checks the
> exit reason for any kvm exit. If it is HV_DECREMENTER, then the
> instruction pointer dumped along with this tracepoint is retrieved and
> mapped with the guest kallsyms. So for powerpc, 'perf kvm record' will
> record 'kvm_hv:kvm_guest_exit' events instead of cycles.
>
> This patch enables the --guest option for perf kvm {record|report} on
> powerpc. Using --host and --guest together still won't work.
>
> This patch can be considered the next iteration of the RFC patch sent by
> Hemant Kumar: https://lkml.org/lkml/2015/6/15/670. Hemant's patch
> enables 'perf kvm report', while I've added code to enable
> 'perf kvm record' on powerpc.
>
> Before applying patch:
> [Note: one needs to run vm with kvm enabled]
>
> $ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules record -a
> [ perf record: Captured and wrote 1.530 MB perf.data.guest (28768 samples) ]
>
> $ ./perf script -i perf.data.guest
> qemu-system-ppc 9688 [000] 842566.451558: 1 cycles:ppp: c0000000001f2860 .mmap_region ([kernel.kallsyms])
> qemu-system-ppc 9688 [000] 842566.451562: 1 cycles:ppp: c0000000000a2d68 .kvmppc_do_h_enter ([kernel.kallsyms])
> qemu-system-ppc 9688 [000] 842566.451564: 7 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
> qemu-system-ppc 9688 [000] 842566.451565: 138 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
> qemu-system-ppc 9688 [000] 842566.451567: 3128 cycles:ppp: c0000000000097d8 ._switch ([kernel.kallsyms])
> qemu-system-ppc 9688 [000] 842566.451570: 81568 cycles:ppp: c0000000000ea8bc .wake_up_new_task ([kernel.kallsyms])
> swapper 0 [004] 842566.451580: 1 cycles:ppp: c0000000001f2d88 .sys_munmap ([kernel.kallsyms])
> swapper 0 [004] 842566.451583: 1 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
> swapper 0 [004] 842566.451584: 11 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
> swapper 0 [004] 842566.451585: 226 cycles:ppp: c0000000000097d4 ._switch ([kernel.kallsyms])
> swapper 0 [004] 842566.451586: 5664 cycles:ppp: c00000000000990c resume_kernel ([kernel.kallsyms])
> swapper 0 [004] 842566.451591: 147929 cycles:ppp: c00000000010a4fc .freeze_set_ops ([kernel.kallsyms])
> swapper 0 [008] 842566.451597: 1 cycles:ppp: c0000000001f2d98 .sys_munmap ([kernel.kallsyms])
> swapper 0 [008] 842566.451600: 1 cycles:ppp: c0000000000a2ee0 .kvmppc_do_h_enter ([kernel.kallsyms])
> swapper 0 [008] 842566.451602: 11 cycles:ppp: c0000000000a2ee0 .kvmppc_do_h_enter ([kernel.kallsyms])
> swapper 0 [008] 842566.451603: 224 cycles:ppp: c00000000001f274 .vsx_unavailable_tm ([kernel.kallsyms])
> swapper 0 [008] 842566.451604: 5240 cycles:ppp: c000000000009984 fast_exception_return ([kernel.kallsyms])
> swapper 0 [008] 842566.451608: 134752 cycles:ppp: c000000000780af4 .inet_diag_handler_get_info ([kernel.kallsyms])
> swapper 0 [012] 842566.451616: 1 cycles:ppp: c0000000001f2828 .mmap_region ([kernel.kallsyms])
> swapper 0 [012] 842566.451619: 1 cycles:ppp: c0000000000a2d78 .kvmppc_do_h_enter ([kernel.kallsyms])
> swapper 0 [012] 842566.451620: 11 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
> swapper 0 [012] 842566.451621: 226 cycles:ppp: c0000000000097d4 ._switch ([kernel.kallsyms])
> swapper 0 [012] 842566.451623: 5549 cycles:ppp: c00000000000990c resume_kernel ([kernel.kallsyms])
>
> $ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules report --stdio
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 28K of event 'cycles:ppp'
> # Event count (approx.): 10473529515
> #
> # Overhead Command Shared Object Symbol
> # ........ ....... ............. ......
> #
>
> #
> # (For a higher level overview, try: perf report --sort comm,dso)
> #
> $
>
> After applying patch:
>
> $ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules record -a
> [ perf record: Captured and wrote 11.898 MB perf.data.guest (127299 samples) ]
>
> $ ./perf script -i perf.data.guest
> qemu-system-ppc 9690 [008] 857043.632783: kvm_hv:kvm_guest_exit: VCPU 12: trap=EXTERNAL pc=0xc00000000057a4f0 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.632858: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091b70 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.632899: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091a14 msr=0x8000000000001032, ceded=0
> qemu-system-ppc 9690 [008] 857043.632912: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.632923: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.632941: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091b70 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.632977: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091b70 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633012: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633033: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633053: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633077: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633097: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633109: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633120: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633599: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091b70 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633637: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091a14 msr=0x8000000000001032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633650: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
> qemu-system-ppc 9690 [008] 857043.633661: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
>
> $ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules report --stdio
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 127K of event 'kvm_hv:kvm_guest_exit'
> # Event count (approx.): 127299
> #
> # Overhead Command Shared Object Symbol
> # ........ ....... ....................... ................................................
> #
> 0.02% :9690 [guest.kernel.kallsyms] [g] .plpar_hcall_norets
> 0.01% :9689 [guest.kernel.kallsyms] [g] .n_tty_write
> 0.00% :9690 [guest.kernel.kallsyms] [g] .n_tty_write
> 0.00% :9690 [unknown] [u] 0x00003fff966eb690
> 0.00% :9688 [guest.kernel.kallsyms] [g] .plpar_hcall_norets
> 0.00% :9689 [guest.kernel.kallsyms] [g] .plpar_hcall_norets
> 0.00% :9689 [unknown] [u] 0x00003fff866b8830
> 0.00% :9690 [unknown] [u] 0x00003fff966eb670
> 0.00% :9687 [guest.kernel.kallsyms] [g] .n_tty_write
> 0.00% :9689 [guest.kernel.kallsyms] [g] .__copy_tofrom_user_power7
> 0.00% :9689 [guest.kernel.kallsyms] [g] ._raw_spin_lock_irqsave
> 0.00% :9689 [guest.kernel.kallsyms] [g] .queue_work_on
> 0.00% :9690 [guest.kernel.kallsyms] [g] .flush_to_ldisc
> 0.00% :9690 [guest.kernel.kallsyms] [g] .plpar_hcall
> 0.00% :9690 [guest.kernel.kallsyms] [g] fast_exception_return
> 0.00% :9690 [unknown] [u] 0x00003fff966eb6a0
> 0.00% :9690 [unknown] [u] 0x00003fff966fd09c
> 0.00% :9687 [guest.kernel.kallsyms] [g] .__copy_tofrom_user_power7
> 0.00% :9688 [guest.kernel.kallsyms] [g] ._raw_spin_lock_irqsave
> 0.00% :9688 [guest.kernel.kallsyms] [g] .n_tty_write
> 0.00% :9688 [guest.kernel.kallsyms] [g] .plpar_hcall
> 0.00% :9689 [guest.kernel.kallsyms] [g] .__srcu_read_unlock
> 0.00% :9689 [guest.kernel.kallsyms] [g] ._raw_spin_lock
> 0.00% :9689 [guest.kernel.kallsyms] [g] .arch_local_irq_restore
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Hemant Kumar <hemant@xxxxxxxxxxxxxxxxxx>
> ---
> tools/perf/arch/powerpc/util/Build | 2 +
> tools/perf/arch/powerpc/util/evlist.c | 22 ++++++++++
> tools/perf/arch/powerpc/util/parse-tp.c | 74 +++++++++++++++++++++++++++++++++
> tools/perf/builtin-annotate.c | 3 +-
> tools/perf/builtin-diff.c | 3 +-
> tools/perf/builtin-mem.c | 10 +++--
> tools/perf/builtin-report.c | 3 +-
> tools/perf/builtin-script.c | 3 +-
> tools/perf/builtin-timechart.c | 8 ++--
> tools/perf/builtin-top.c | 3 +-
> tools/perf/tests/hists_cumulate.c | 2 +-
> tools/perf/tests/hists_filter.c | 2 +-
> tools/perf/tests/hists_link.c | 4 +-
> tools/perf/tests/hists_output.c | 2 +-
> tools/perf/util/event.c | 15 +++++--
> tools/perf/util/event.h | 3 +-
> tools/perf/util/evlist.c | 8 ++++
> tools/perf/util/evlist.h | 1 +
> tools/perf/util/evsel.c | 7 ++++
> tools/perf/util/evsel.h | 3 ++
> tools/perf/util/session.c | 9 ++--
> tools/perf/util/util.c | 5 +++
> tools/perf/util/util.h | 2 +
> 23 files changed, 169 insertions(+), 25 deletions(-)
> create mode 100644 tools/perf/arch/powerpc/util/evlist.c
> create mode 100644 tools/perf/arch/powerpc/util/parse-tp.c
>
> diff --git a/tools/perf/arch/powerpc/util/Build b/tools/perf/arch/powerpc/util/Build
> index 7b8b0d1..edd08e4 100644
> --- a/tools/perf/arch/powerpc/util/Build
> +++ b/tools/perf/arch/powerpc/util/Build
> @@ -1,5 +1,7 @@
> libperf-y += header.o
> libperf-y += sym-handling.o
> +libperf-y += parse-tp.o
> +libperf-y += evlist.o
>
> libperf-$(CONFIG_DWARF) += dwarf-regs.o
> libperf-$(CONFIG_DWARF) += skip-callchain-idx.o
> diff --git a/tools/perf/arch/powerpc/util/evlist.c b/tools/perf/arch/powerpc/util/evlist.c
> new file mode 100644
> index 0000000..6a16d72
> --- /dev/null
> +++ b/tools/perf/arch/powerpc/util/evlist.c
> @@ -0,0 +1,22 @@
> +#include <linux/err.h>
> +#include "../../util/evsel.h"
> +#include "../../util/evlist.h"
> +
> +/*
> + * To sample for only guest, record kvm_hv:kvm_guest_exit.
> + * Otherwise go via normal way(cycles).
> + */
> +int perf_evlist__arch_add_default(struct perf_evlist *evlist)
> +{
> +	struct perf_evsel *evsel;
> +
> +	if (!perf_guest_only())
> +		return -1;
> +
> +	evsel = perf_evsel__newtp_idx("kvm_hv", "kvm_guest_exit", 0);
> +	if (IS_ERR(evsel))
> +		return PTR_ERR(evsel);
> +
> +	perf_evlist__add(evlist, evsel);
> +	return 0;
> +}
> diff --git a/tools/perf/arch/powerpc/util/parse-tp.c b/tools/perf/arch/powerpc/util/parse-tp.c
> new file mode 100644
> index 0000000..50c4ac8
> --- /dev/null
> +++ b/tools/perf/arch/powerpc/util/parse-tp.c
> @@ -0,0 +1,74 @@
> +#include "../../util/evsel.h"
> +#include "../../util/trace-event.h"
> +#include "../../util/session.h"
> +#include "../../util/util.h"
> +
> +#define KVMPPC_EXIT "kvm_hv:kvm_guest_exit"
> +#define HV_DECREMENTER 2432
> +#define HV_BIT 3
> +#define PR_BIT 49
> +#define PPC_MAX 63
> +
> +static bool is_kvmppc_exit_event(struct perf_evsel *evsel)
> +{
> +	static unsigned int kvmppc_exit;
> +
> +	if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
> +		return false;
> +
> +	if (unlikely(kvmppc_exit == 0)) {
> +		if (strcmp(KVMPPC_EXIT, evsel->name))
> +			return false;
> +		kvmppc_exit = evsel->attr.config;
> +	} else if (kvmppc_exit != evsel->attr.config) {
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static bool is_hv_dec_trap(struct perf_evsel *evsel, struct perf_sample *sample)
> +{
> +	int trap = perf_evsel__intval(evsel, sample, "trap");
> +	return trap == HV_DECREMENTER;
> +}
> +
> +/*
> + * Get the instruction pointer from the tracepoint data
> + */
> +u64 arch__get_ip(struct perf_evsel *evsel, struct perf_sample *sample)
> +{
> +	if (perf_guest_only() &&
> +	    is_kvmppc_exit_event(evsel) &&
> +	    is_hv_dec_trap(evsel, sample))
> +		return perf_evsel__intval(evsel, sample, "pc");
> +
> +	return sample->ip;
> +}
> +
> +/*
> + * Get the HV and PR bits and accordingly, determine the cpumode
> + */
> +u8 arch__get_cpumode(const union perf_event *event, struct perf_evsel *evsel,
> +		     struct perf_sample *sample)
> +{
> +	unsigned long hv, pr, msr;
> +	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
> +
> +	if (!perf_guest_only() || !is_kvmppc_exit_event(evsel))
> +		goto ret;
> +
> +	if (sample->raw_data && is_hv_dec_trap(evsel, sample)) {
> +		msr = perf_evsel__intval(evsel, sample, "msr");
> +		hv = msr & ((unsigned long)1 << (PPC_MAX - HV_BIT));
> +		pr = msr & ((unsigned long)1 << (PPC_MAX - PR_BIT));
> +
> +		if (!hv && pr)
> +			cpumode = PERF_RECORD_MISC_GUEST_USER;
> +		else
> +			cpumode = PERF_RECORD_MISC_GUEST_KERNEL;
> +	}
> +
> +ret:
> +	return cpumode;
> +}
> diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
> index 2bf9b3f..c4c9b45 100644
> --- a/tools/perf/builtin-annotate.c
> +++ b/tools/perf/builtin-annotate.c
> @@ -91,7 +91,8 @@ static int process_sample_event(struct perf_tool *tool,
> struct addr_location al;
> int ret = 0;
>
> - if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
> + if (perf_event__preprocess_sample(event, machine, &al,
> + sample, evsel) < 0) {
> pr_warning("problem processing %d event, skipping it.\n",
> event->header.type);
> return -1;
> diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
> index 0b180a8..6ec7952 100644
> --- a/tools/perf/builtin-diff.c
> +++ b/tools/perf/builtin-diff.c
> @@ -330,7 +330,8 @@ static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
> struct hists *hists = evsel__hists(evsel);
> int ret = -1;
>
> - if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
> + if (perf_event__preprocess_sample(event, machine, &al,
> + sample, evsel) < 0) {
> pr_warning("problem processing %d event, skipping it.\n",
> event->header.type);
> return -1;
> diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
> index 80170aa..ceb3977 100644
> --- a/tools/perf/builtin-mem.c
> +++ b/tools/perf/builtin-mem.c
> @@ -61,13 +61,15 @@ static int
> dump_raw_samples(struct perf_tool *tool,
> union perf_event *event,
> struct perf_sample *sample,
> - struct machine *machine)
> + struct machine *machine,
> + struct perf_evsel *evsel)
> {
> struct perf_mem *mem = container_of(tool, struct perf_mem, tool);
> struct addr_location al;
> const char *fmt;
>
> - if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
> + if (perf_event__preprocess_sample(event, machine, &al,
> + sample, evsel) < 0) {
> fprintf(stderr, "problem processing %d event, skipping it.\n",
> event->header.type);
> return -1;
> @@ -111,10 +113,10 @@ out_put:
> static int process_sample_event(struct perf_tool *tool,
> union perf_event *event,
> struct perf_sample *sample,
> - struct perf_evsel *evsel __maybe_unused,
> + struct perf_evsel *evsel,
> struct machine *machine)
> {
> - return dump_raw_samples(tool, event, sample, machine);
> + return dump_raw_samples(tool, event, sample, machine, evsel);
> }
>
> static int report_raw_events(struct perf_mem *mem)
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index f256fac..5dcd6a5 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -151,7 +151,8 @@ static int process_sample_event(struct perf_tool *tool,
> };
> int ret = 0;
>
> - if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
> + if (perf_event__preprocess_sample(event, machine, &al,
> + sample, evsel) < 0) {
> pr_debug("problem processing %d event, skipping it.\n",
> event->header.type);
> return -1;
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 72b5deb..b1b47a8 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -715,7 +715,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
> return 0;
> }
>
> - if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
> + if (perf_event__preprocess_sample(event, machine, &al,
> + sample, evsel) < 0) {
> pr_err("problem processing %d event, skipping it.\n",
> event->header.type);
> return -1;
> diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
> index 30e5962..62f8369 100644
> --- a/tools/perf/builtin-timechart.c
> +++ b/tools/perf/builtin-timechart.c
> @@ -470,7 +470,8 @@ static void sched_switch(struct timechart *tchart, int cpu, u64 timestamp,
>
> static const char *cat_backtrace(union perf_event *event,
> struct perf_sample *sample,
> - struct machine *machine)
> + struct machine *machine,
> + struct perf_evsel *evsel)
> {
> struct addr_location al;
> unsigned int i;
> @@ -489,7 +490,8 @@ static const char *cat_backtrace(union perf_event *event,
> if (!chain)
> goto exit;
>
> - if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
> + if (perf_event__preprocess_sample(event, machine, &al,
> + sample, evsel) < 0) {
> fprintf(stderr, "problem processing %d event, skipping it.\n",
> event->header.type);
> goto exit;
> @@ -569,7 +571,7 @@ static int process_sample_event(struct perf_tool *tool,
> if (evsel->handler != NULL) {
> tracepoint_handler f = evsel->handler;
> return f(tchart, evsel, sample,
> - cat_backtrace(event, sample, machine));
> + cat_backtrace(event, sample, machine, evsel));
> }
>
> return 0;
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 7e2e72e..0869c18 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -734,7 +734,8 @@ static void perf_event__process_sample(struct perf_tool *tool,
> if (event->header.misc & PERF_RECORD_MISC_EXACT_IP)
> top->exact_samples++;
>
> - if (perf_event__preprocess_sample(event, machine, &al, sample) < 0)
> + if (perf_event__preprocess_sample(event, machine, &al,
> + sample, evsel) < 0)
> return;
>
> if (!top->kptr_restrict_warned &&
> diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
> index 7ed7370..cca7568 100644
> --- a/tools/perf/tests/hists_cumulate.c
> +++ b/tools/perf/tests/hists_cumulate.c
> @@ -103,7 +103,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
> sample.callchain = (struct ip_callchain *)fake_callchains[i];
>
> if (perf_event__preprocess_sample(&event, machine, &al,
> - &sample) < 0)
> + &sample, evsel) < 0)
> goto out;
>
> if (hist_entry_iter__add(&iter, &al, PERF_MAX_STACK_DEPTH,
> diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
> index 818acf8..736321d 100644
> --- a/tools/perf/tests/hists_filter.c
> +++ b/tools/perf/tests/hists_filter.c
> @@ -81,7 +81,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
> sample.ip = fake_samples[i].ip;
>
> if (perf_event__preprocess_sample(&event, machine, &al,
> - &sample) < 0)
> + &sample, evsel) < 0)
> goto out;
>
> al.socket = fake_samples[i].socket;
> diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
> index 8c102b0..4a0f0e3 100644
> --- a/tools/perf/tests/hists_link.c
> +++ b/tools/perf/tests/hists_link.c
> @@ -86,7 +86,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
> sample.tid = fake_common_samples[k].pid;
> sample.ip = fake_common_samples[k].ip;
> if (perf_event__preprocess_sample(&event, machine, &al,
> - &sample) < 0)
> + &sample, evsel) < 0)
> goto out;
>
> he = __hists__add_entry(hists, &al, NULL,
> @@ -112,7 +112,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
> sample.tid = fake_samples[i][k].pid;
> sample.ip = fake_samples[i][k].ip;
> if (perf_event__preprocess_sample(&event, machine, &al,
> - &sample) < 0)
> + &sample, evsel) < 0)
> goto out;
>
> he = __hists__add_entry(hists, &al, NULL,
> diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
> index adbebc8..48d0e95 100644
> --- a/tools/perf/tests/hists_output.c
> +++ b/tools/perf/tests/hists_output.c
> @@ -69,7 +69,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
> sample.ip = fake_samples[i].ip;
>
> if (perf_event__preprocess_sample(&event, machine, &al,
> - &sample) < 0)
> + &sample, evsel) < 0)
> goto out;
>
> if (hist_entry_iter__add(&iter, &al, PERF_MAX_STACK_DEPTH,
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 8b10621..70ae70d 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -983,6 +983,13 @@ void thread__find_addr_location(struct thread *thread,
> al->sym = NULL;
> }
>
> +u8 __weak arch__get_cpumode(const union perf_event *event,
> + __maybe_unused struct perf_evsel *evsel,
> + __maybe_unused struct perf_sample *sample)
> +{
> + return event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
> +}
> +
> /*
> * Callers need to drop the reference to al->thread, obtained in
> * machine__findnew_thread()
> @@ -990,15 +997,17 @@ void thread__find_addr_location(struct thread *thread,
> int perf_event__preprocess_sample(const union perf_event *event,
> struct machine *machine,
> struct addr_location *al,
> - struct perf_sample *sample)
> + struct perf_sample *sample,
> + struct perf_evsel *evsel)
> {
> - u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
> + u8 cpumode;
> struct thread *thread = machine__findnew_thread(machine, sample->pid,
> sample->tid);
> -
> if (thread == NULL)
> return -1;
>
> + al->cpumode = cpumode = arch__get_cpumode(event, evsel, sample);
> +
> dump_printf(" ... thread: %s:%d\n", thread__comm_str(thread), thread->tid);
> /*
> * Have we already created the kernel maps for this machine?
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index a0dbcbd..9b3c8b6 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -457,7 +457,8 @@ struct addr_location;
> int perf_event__preprocess_sample(const union perf_event *event,
> struct machine *machine,
> struct addr_location *al,
> - struct perf_sample *sample);
> + struct perf_sample *sample,
> + struct perf_evsel *evsel);
>
> void addr_location__put(struct addr_location *al);
>
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index d139219..a0116f1 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -219,6 +219,11 @@ void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr)
> }
> }
>
> +int __weak perf_evlist__arch_add_default(__maybe_unused struct perf_evlist *evlist)
> +{
> + return -1;
> +}
> +
> int perf_evlist__add_default(struct perf_evlist *evlist)
> {
> struct perf_event_attr attr = {
> @@ -227,6 +232,9 @@ int perf_evlist__add_default(struct perf_evlist *evlist)
> };
> struct perf_evsel *evsel;
>
> + if (!perf_evlist__arch_add_default(evlist))
> + return 0;
> +
> event_attr_init(&attr);
>
> perf_event_attr__set_max_precise_ip(&attr);
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index a459fe7..a7f01d8 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -74,6 +74,7 @@ void perf_evlist__delete(struct perf_evlist *evlist);
>
> void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry);
> void perf_evlist__remove(struct perf_evlist *evlist, struct perf_evsel *evsel);
> +int perf_evlist__arch_add_default(struct perf_evlist *evlist);
> int perf_evlist__add_default(struct perf_evlist *evlist);
> int __perf_evlist__add_default_attrs(struct perf_evlist *evlist,
> struct perf_event_attr *attrs, size_t nr_attrs);
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 397fb4e..c86252e 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1581,6 +1581,12 @@ static inline bool overflow(const void *endp, u16 max_size, const void *offset,
> #define OVERFLOW_CHECK_u64(offset) \
> OVERFLOW_CHECK(offset, sizeof(u64), sizeof(u64))
>
> +u64 __weak arch__get_ip(__maybe_unused struct perf_evsel *evsel,
> + struct perf_sample *sample)
> +{
> + return sample->ip;
> +}
> +
> int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
> struct perf_sample *data)
> {
> @@ -1754,6 +1760,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
> OVERFLOW_CHECK(array, data->raw_size, max_size);
> data->raw_data = (void *)array;
> array = (void *)array + data->raw_size;
> + data->ip = arch__get_ip(evsel, data);
> }
>
> if (type & PERF_SAMPLE_BRANCH_STACK) {
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 0e49bd7..7dca348 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -398,4 +398,7 @@ typedef int (*attr__fprintf_f)(FILE *, const char *, const char *, void *);
> int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
> attr__fprintf_f attr__fprintf, void *priv);
>
> +u64 arch__get_ip(struct perf_evsel *evsel, struct perf_sample *sample);
> +u8 arch__get_cpumode(const union perf_event *event, struct perf_evsel *evsel,
> + struct perf_sample *sample);
> #endif /* __PERF_EVSEL_H */
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index c35ffdd..e12eb58 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -953,10 +953,11 @@ static void dump_sample(struct perf_evsel *evsel, union perf_event *event,
> }
>
> static struct machine *machines__find_for_cpumode(struct machines *machines,
> - union perf_event *event,
> - struct perf_sample *sample)
> + union perf_event *event,
> + struct perf_sample *sample,
> + struct perf_evsel *evsel)
> {
> - const u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
> + u8 cpumode = arch__get_cpumode(event, evsel, sample);
> struct machine *machine;
>
> if (perf_guest &&
> @@ -1060,7 +1061,7 @@ static int machines__deliver_event(struct machines *machines,
>
> evsel = perf_evlist__id2evsel(evlist, sample->id);
>
> - machine = machines__find_for_cpumode(machines, event, sample);
> + machine = machines__find_for_cpumode(machines, event, sample, evsel);
>
> switch (event->header.type) {
> case PERF_RECORD_SAMPLE:
> diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
> index 47b1e36..09ec0e4 100644
> --- a/tools/perf/util/util.c
> +++ b/tools/perf/util/util.c
> @@ -35,6 +35,11 @@ bool test_attr__enabled;
> bool perf_host = true;
> bool perf_guest = false;
>
> +bool perf_guest_only(void)
> +{
> + return !perf_host && perf_guest;
> +}
> +
> void event_attr_init(struct perf_event_attr *attr)
> {
> if (!perf_host)
> diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
> index dcc6590..207bdc8 100644
> --- a/tools/perf/util/util.h
> +++ b/tools/perf/util/util.h
> @@ -358,4 +358,6 @@ int fetch_kernel_version(unsigned int *puint,
> #define KVER_FMT "%d.%d.%d"
> #define KVER_PARAM(x) KVER_VERSION(x), KVER_PATCHLEVEL(x), KVER_SUBLEVEL(x)
>
> +bool perf_guest_only(void);
> +
> #endif /* GIT_COMPAT_UTIL_H */
> --
> 2.1.4
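
For readers following the cpumode logic in parse-tp.c above, here is a
minimal standalone sketch of the same trap filter and MSR bit test,
pulled out of perf so it can be compiled on its own. The constants are
the ones from the patch; enum guest_mode, ibm_bit() and classify_exit()
are hypothetical names introduced only for this illustration, and the
NOT_SAMPLED result is an assumption standing in for "leave the cpumode
from the event header untouched".

```c
#include <stdbool.h>
#include <stdint.h>

#define HV_DECREMENTER	2432	/* 0x980, the trap value checked in the patch */
#define HV_BIT		3	/* IBM bit numbering: bit 0 is the MSB */
#define PR_BIT		49
#define PPC_MAX		63

/* Hypothetical result type, standing in for the PERF_RECORD_MISC_* values. */
enum guest_mode { NOT_SAMPLED, GUEST_KERNEL, GUEST_USER };

/* Convert an IBM-numbered bit position into a 64-bit mask. */
static uint64_t ibm_bit(unsigned int n)
{
	return (uint64_t)1 << (PPC_MAX - n);
}

/*
 * Only HV_DECREMENTER exits are treated as samples. For those, the
 * sample is guest user space when MSR[PR] is set and MSR[HV] is clear,
 * and guest kernel otherwise, mirroring the if/else in the patch.
 */
enum guest_mode classify_exit(int trap, uint64_t msr)
{
	bool hv, pr;

	if (trap != HV_DECREMENTER)
		return NOT_SAMPLED;

	hv = msr & ibm_bit(HV_BIT);
	pr = msr & ibm_bit(PR_BIT);

	return (!hv && pr) ? GUEST_USER : GUEST_KERNEL;
}
```

As a usage note: the msr value 0x8000000000009032 seen in the 'perf
script' output above has PR clear, so a decrementer exit carrying it
would be classified as guest kernel; any exit with another trap value
(the SYSCALL and EXTERNAL lines in that output, for example) is not
treated as a sample by this filter.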