Re: [PATCH V2 3/3] perf regs x86: Add X86 specific arch__intr_reg_mask()
From: Arnaldo Carvalho de Melo
Date: Wed May 15 2019 - 15:30:26 EST
On Tue, May 14, 2019 at 01:19:34PM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>
> XMM registers can be collected on Icelake and later platforms.
>
> Add a specific arch__intr_reg_mask(), which creates an event to check if
> the kernel and hardware can collect XMM registers.
>
> Tested on Skylake, which doesn't support XMM register collection.
> Nothing changed.
Thanks a lot for doing this and for testing on both a machine without these
registers and one with them.
Applied, together with Ravi's tested-by for the first two and the change
in the --user-regs doc,
Regards,
- Arnaldo
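
For readers following along, the check described in the commit message (create
an event that asks for XMM registers and see whether the kernel accepts it) can
be sketched as a standalone function. This is only a sketch under assumptions,
not the code that was applied: probe_intr_reg_mask() and perf_open() are
hypothetical names, the two mask constants are hard-coded for x86-64 from the
values in the quoted patch and test output, and the perf-internal
event_attr_init() step is left out.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Assumed x86-64 values: the GPR mask is the one perf evlist prints on
 * Skylake, the XMM mask matches PERF_XMM_REGS_MASK with XMM0 at bit 32.
 */
#define GPR_REGS_MASK 0xff0fffULL
#define XMM_REGS_MASK (~((1ULL << 32) - 1))

/* Thin wrapper: glibc does not provide a perf_event_open() function. */
static int perf_open(struct perf_event_attr *attr, pid_t pid, int cpu,
		     int group_fd, unsigned long flags)
{
	return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical standalone counterpart of arch__intr_reg_mask(). */
uint64_t probe_intr_reg_mask(void)
{
	struct perf_event_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 1;
	attr.sample_type = PERF_SAMPLE_REGS_INTR;
	attr.sample_regs_intr = XMM_REGS_MASK;	/* request XMM registers only */
	attr.precise_ip = 1;
	attr.disabled = 1;			/* the event is never enabled */
	attr.exclude_kernel = 1;

	/* The requested mask must fit the 64-bit register mask field. */
	assert(sizeof(attr.sample_regs_intr) == sizeof(uint64_t));

	fd = perf_open(&attr, 0, -1, -1, 0);	/* current thread, any CPU */
	if (fd != -1) {
		close(fd);
		return XMM_REGS_MASK | GPR_REGS_MASK;
	}

	/* Any failure (old kernel, no hardware support, no permission)
	 * falls back to the general-purpose registers. */
	return GPR_REGS_MASK;
}
```

Since every failure of the open is treated the same way, the function can only
ever return one of the two masks, which is why Skylake output stays unchanged.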
> #perf record -I?
> available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9
> R10 R11 R12 R13 R14 R15
>
> Usage: perf record [<options>] [<command>]
> or: perf record [<options>] -- <command> [<options>]
>
> -I, --intr-regs[=<any register>]
> sample selected machine registers on
> interrupt, use '-I?' to list register names
>
> #perf record -I
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ]
>
> #perf evlist -v
> cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
> IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
> inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
> sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
> 1, bpf_event: 1, sample_regs_intr: 0xff0fff
>
> Tested on Icelake, which supports XMM register collection.
>
> #perf record -I?
> available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
> R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
> XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
>
> Usage: perf record [<options>] [<command>]
> or: perf record [<options>] -- <command> [<options>]
>
> -I, --intr-regs[=<any register>]
> sample selected machine registers on
> interrupt, use '-I?' to list register names
>
> #perf record -I
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ]
>
> #perf evlist -v
> cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
> IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
> inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
> sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
> 1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff
>
> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> ---
>
> Changes since V1:
> - Add specific arch__intr_reg_mask() support
> Drop specific has_non_gprs_support() and non_gprs_mask()
>
> tools/perf/arch/x86/include/perf_regs.h | 1 +
> tools/perf/arch/x86/util/perf_regs.c | 25 +++++++++++++++++++++++++
> 2 files changed, 26 insertions(+)
>
> diff --git a/tools/perf/arch/x86/include/perf_regs.h b/tools/perf/arch/x86/include/perf_regs.h
> index b732133..b7cd91a 100644
> --- a/tools/perf/arch/x86/include/perf_regs.h
> +++ b/tools/perf/arch/x86/include/perf_regs.h
> @@ -9,6 +9,7 @@
> void perf_regs_load(u64 *regs);
>
> #define PERF_REGS_MAX PERF_REG_X86_XMM_MAX
> +#define PERF_XMM_REGS_MASK (~((1ULL << PERF_REG_X86_XMM0) - 1))
> #ifndef HAVE_ARCH_X86_64_SUPPORT
> #define PERF_REGS_MASK ((1ULL << PERF_REG_X86_32_MAX) - 1)
> #define PERF_SAMPLE_REGS_ABI PERF_SAMPLE_REGS_ABI_32
> diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
> index 71d7604..c3d7479 100644
> --- a/tools/perf/arch/x86/util/perf_regs.c
> +++ b/tools/perf/arch/x86/util/perf_regs.c
> @@ -270,3 +270,28 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
>
> return SDT_ARG_VALID;
> }
> +
> +uint64_t arch__intr_reg_mask(void)
> +{
> +	struct perf_event_attr attr = {
> +		.type = PERF_TYPE_HARDWARE,
> +		.config = PERF_COUNT_HW_CPU_CYCLES,
> +		.sample_period = 1,
> +		.sample_type = PERF_SAMPLE_REGS_INTR,
> +		.sample_regs_intr = PERF_XMM_REGS_MASK,
> +		.precise_ip = 1,
> +		.disabled = 1,
> +		.exclude_kernel = 1,
> +	};
> +	int fd;
> +
> +	event_attr_init(&attr);
> +
> +	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> +	if (fd != -1) {
> +		close(fd);
> +		return (PERF_XMM_REGS_MASK | PERF_REGS_MASK);
> +	}
> +
> +	return PERF_REGS_MASK;
> +}
> --
> 2.7.4
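
The two sample_regs_intr values in the quoted test output (0xff0fff on Skylake,
0xffffffff00ff0fff on Icelake) follow from the mask definitions. A small worked
sketch of the arithmetic, assuming the register indices below mirror the
PERF_REG_X86_* enum in the x86 uapi perf_regs.h header (the REG_* names,
reg_bit(), gpr_mask() and xmm_mask() are made up for this illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the x86 PERF_REG_X86_* indices; copied here so the
 * sketch is standalone. */
enum {
	REG_AX, REG_BX, REG_CX, REG_DX, REG_SI, REG_DI, REG_BP, REG_SP,
	REG_IP, REG_FLAGS, REG_CS, REG_SS, REG_DS, REG_ES, REG_FS, REG_GS,
	REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15,
	REG_64_MAX,		/* 24: one past R15 */
	REG_XMM0 = 32,		/* XMM registers start at bit 32 */
};

static uint64_t reg_bit(unsigned int idx)
{
	assert(idx < 64);	/* the mask is a single 64-bit word */
	return 1ULL << idx;
}

/* General-purpose registers, minus DS/ES/FS/GS, which do not show up in
 * the "available registers" list. */
uint64_t gpr_mask(void)
{
	uint64_t all = reg_bit(REG_64_MAX) - 1;		/* bits 0..23 */
	uint64_t unsupported = reg_bit(REG_DS) | reg_bit(REG_ES) |
			       reg_bit(REG_FS) | reg_bit(REG_GS);

	return all & ~unsupported;
}

/* Same expression as PERF_XMM_REGS_MASK: every bit from XMM0 upwards. */
uint64_t xmm_mask(void)
{
	return ~(reg_bit(REG_XMM0) - 1);
}
```

The upper 32 bits cover XMM0..XMM15 because each 128-bit XMM register takes two
consecutive 64-bit slots in the mask.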