Re: [PATCH 1/2] perf/ibs: Fix interface via core pmu events

From: Namhyung Kim
Date: Thu Mar 02 2023 - 17:10:33 EST


Hi Ravi,

On Thu, Mar 2, 2023 at 1:22 AM Ravi Bangoria <ravi.bangoria@xxxxxxx> wrote:
>
> Although IBS pmu can be invoked via its own interface, indirect
> IBS invocation via core pmu event is also supported with fixed set
> of events: cpu-cycles:p, r076:p (same as cpu-cycles:p) and r0C1:p
> (micro-ops) for user convenience.
>
> This indirect IBS invocation is broken since commit 66d258c5b048
> ("perf/core: Optimize perf_init_event()"), which added RAW pmu
> under pmu_idr list and thus if event_init() fails with RAW pmu,
> it started returning error instead of trying other pmus.
>
> Fix it by introducing new pmu capability PERF_PMU_CAP_FORWARD_EVENT.
> Kernel will try to open event on other pmus if user requested pmu,
> having this capability, fails to open event.
>
> Without patch:
> $ sudo ./perf record -C 0 -e r076:p -- sleep 1
> Error:
> The r076:p event is not supported.
>
> With patch:
> $ sudo ./perf record -C 0 -e r076:p -- sleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.341 MB perf.data (37 samples) ]
>
> This new capability does not have a notion of forward pmu mapping.
> i.e. it doesn't know which pmu (or set of pmus) the event should be
> forwarded to. As of now, only AMD core pmu forwards a set of events
> to IBS pmu when precise_ip attribute is set and thus trying with all
> pmus works. But if more pmus start using this capability, some sort
> of forward pmu mapping needs to be introduced through which the event
> can directly get forwarded to only mapped pmus. Otherwise, trying all
> pmus can inadvertently open event on wrong pmu.
>
> Fixes: 66d258c5b048 ("perf/core: Optimize perf_init_event()")
> Reported-by: Stephane Eranian <eranian@xxxxxxxxxx>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@xxxxxxx>
> ---
> arch/x86/events/amd/core.c | 5 +++++
> arch/x86/events/core.c | 2 ++
> arch/x86/events/perf_event.h | 3 +++
> include/linux/perf_event.h | 1 +
> kernel/events/core.c | 11 ++++++++---
> 5 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
> index 8c45b198b62f..f4c67362cfde 100644
> --- a/arch/x86/events/amd/core.c
> +++ b/arch/x86/events/amd/core.c
> @@ -1264,6 +1264,11 @@ static __initconst const struct x86_pmu amd_pmu = {
> .cpu_dead = amd_pmu_cpu_dead,
>
> .amd_nb_constraints = 1,
> + /*
> + * Raw events with precise attribute set needs to be
> + * forwarded to IBS pmu.
> + */
> + .capabilities = PERF_PMU_CAP_FORWARD_EVENT,
> };
>
> static ssize_t branches_show(struct device *cdev,
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index d096b04bf80e..3f27b44f337a 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2156,6 +2156,8 @@ static int __init init_hw_perf_events(void)
> if (err)
> goto out1;
>
> + pmu.capabilities |= x86_pmu.capabilities;
> +
> if (!is_hybrid()) {
> err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
> if (err)
> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> index d6de4487348c..41e792bb442d 100644
> --- a/arch/x86/events/perf_event.h
> +++ b/arch/x86/events/perf_event.h
> @@ -941,6 +941,9 @@ struct x86_pmu {
> int num_hybrid_pmus;
> struct x86_hybrid_pmu *hybrid_pmu;
> u8 (*get_hybrid_cpu_type) (void);
> +
> + /* Capabilities that needs to be forwarded to pmu->capabilities */
> + int capabilities;
> };
>
> struct x86_perf_task_context_opt {
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index d5628a7b5eaa..4459e0918e28 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -292,6 +292,7 @@ struct perf_event_pmu_context;
> #define PERF_PMU_CAP_NO_EXCLUDE 0x0080
> #define PERF_PMU_CAP_AUX_OUTPUT 0x0100
> #define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0200
> +#define PERF_PMU_CAP_FORWARD_EVENT 0x0400
>
> struct perf_output_handle;
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index a5a51dfdd622..c3f59d937280 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -11633,9 +11633,13 @@ static struct pmu *perf_init_event(struct perf_event *event)
> goto fail;
>
> ret = perf_try_init_event(pmu, event);
> - if (ret == -ENOENT && event->attr.type != type && !extended_type) {
> - type = event->attr.type;
> - goto again;
> + if (ret == -ENOENT) {
> + if (event->attr.type != type && !extended_type) {
> + type = event->attr.type;
> + goto again;
> + }
> + if (pmu->capabilities & PERF_PMU_CAP_FORWARD_EVENT)
> + goto try_all;

Wouldn't it be better to use a different error code to indicate
that the failure is about precise_ip (or forwarding in general)?
Otherwise, any other invalid config that returns -ENOENT would
also trigger the forwarding unnecessarily.
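
To illustrate the concern, here is a minimal user-space sketch (not
kernel code). Everything in it is an assumption made up for the
example: the model_* names, the MODEL_EFORWARD value, the toy rule
that the core pmu rejects configs above 0xff, and the "other" pmu
that accepts loosely. It only shows how forwarding on a plain -ENOENT
can land an invalid event on an unrelated pmu, while a dedicated
code limits forwarding to the precise events.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dedicated code: "valid request, but for another pmu". */
#define MODEL_EFORWARD 1000

struct model_event {
	unsigned int config;	/* raw event code, e.g. 0x076 */
	bool precise;		/* stands in for attr.precise_ip != 0 */
};

struct model_pmu {
	const char *name;
	int (*event_init)(const struct model_event *ev);
};

/* Toy core pmu: the events IBS can serve are forwarded explicitly. */
static int core_event_init(const struct model_event *ev)
{
	if (ev->config > 0xff)
		return -ENOENT;		/* invalid config, not a forward */
	if (ev->precise) {
		if (ev->config == 0x076 || ev->config == 0x0c1)
			return -MODEL_EFORWARD;
		return -ENOENT;
	}
	return 0;
}

/* Toy IBS pmu: only takes the precise events listed in the changelog. */
static int ibs_event_init(const struct model_event *ev)
{
	if (ev->precise && (ev->config == 0x076 || ev->config == 0x0c1))
		return 0;
	return -ENOENT;
}

/* Hypothetical unrelated pmu with a loose config check. */
static int other_event_init(const struct model_event *ev)
{
	return (ev->config & 0x100) ? 0 : -ENOENT;
}

static const struct model_pmu core_pmu = { "cpu", core_event_init };
static const struct model_pmu fallback_pmus[] = {
	{ "ibs", ibs_event_init },
	{ "other", other_event_init },
};

/*
 * Returns the name of the pmu that accepted the event, or NULL.
 * forward_on_enoent = true models "any -ENOENT from the core pmu
 * goes to try_all"; false forwards only on the dedicated code.
 */
const char *model_open(const struct model_event *ev, bool forward_on_enoent)
{
	int ret = core_pmu.event_init(ev);
	size_t i;

	if (!ret)
		return core_pmu.name;
	if (ret != -MODEL_EFORWARD && !(forward_on_enoent && ret == -ENOENT))
		return NULL;

	for (i = 0; i < sizeof(fallback_pmus) / sizeof(fallback_pmus[0]); i++) {
		if (!fallback_pmus[i].event_init(ev))
			return fallback_pmus[i].name;
	}
	return NULL;
}
```

In this model, { .config = 0x076, .precise = true } ends up on "ibs"
under either policy. An invalid { .config = 0x176 } fails cleanly
with the dedicated code, but gets opened on "other" when every
-ENOENT is forwarded.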

Thanks,
Namhyung


> }
>
> if (ret)
> @@ -11644,6 +11648,7 @@ static struct pmu *perf_init_event(struct perf_event *event)
> goto unlock;
> }
>
> +try_all:
> list_for_each_entry_rcu(pmu, &pmus, entry, lockdep_is_held(&pmus_srcu)) {
> ret = perf_try_init_event(pmu, event);
> if (!ret)
> --
> 2.39.2
>