Re: [PATCH bpf-next 1/3] perf: enable branch record for software events
From: Peter Zijlstra
Date: Wed Aug 25 2021 - 08:10:30 EST
On Mon, Aug 23, 2021 at 11:01:55PM -0700, Song Liu wrote:
> arch/x86/events/intel/core.c | 5 ++++-
> arch/x86/events/intel/lbr.c | 12 ++++++++++++
> arch/x86/events/perf_event.h | 2 ++
> include/linux/perf_event.h | 33 +++++++++++++++++++++++++++++++++
> kernel/events/core.c | 28 ++++++++++++++++++++++++++++
> 5 files changed, 79 insertions(+), 1 deletion(-)
No PowerPC support :/
> +void intel_pmu_snapshot_branch_stack(void)
> +{
> + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +
> + intel_pmu_lbr_disable_all();
> + intel_pmu_lbr_read();
> + memcpy(this_cpu_ptr(&perf_branch_snapshot_entries), cpuc->lbr_entries,
> + sizeof(struct perf_branch_entry) * x86_pmu.lbr_nr);
> + *this_cpu_ptr(&perf_branch_snapshot_size) = x86_pmu.lbr_nr;
> + intel_pmu_lbr_enable_all(false);
> +}
Still has the layering violation and issues vs PMI.
> +#ifdef CONFIG_HAVE_STATIC_CALL
> +DECLARE_STATIC_CALL(perf_snapshot_branch_stack,
> + perf_default_snapshot_branch_stack);
> +#else
> +extern void (*perf_snapshot_branch_stack)(void);
> +#endif
That's weird; static_call should work unconditionally. Without
CONFIG_HAVE_STATIC_CALL it falls back to a regular function pointer,
exactly like you open-code here. Search for:
"Generic Implementation" in include/linux/static_call.h
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 011cc5069b7ba..b42cc20451709 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> +#ifdef CONFIG_HAVE_STATIC_CALL
> +DEFINE_STATIC_CALL(perf_snapshot_branch_stack,
> + perf_default_snapshot_branch_stack);
> +#else
> +void (*perf_snapshot_branch_stack)(void) = perf_default_snapshot_branch_stack;
> +#endif
Same here.
Something like:
DEFINE_STATIC_CALL_NULL(perf_snapshot_branch_stack, void (*)(void));
with usage like: static_call_cond(perf_snapshot_branch_stack)();
Should unconditionally work.
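To illustrate the pattern being suggested, here is a rough userspace sketch of a hook that starts out NULL and is only invoked when a target has been installed. It is not the kernel static_call API; every name in it (snapshot_hook, snapshot_cond, snapshot_update, fake_pmu_snapshot, struct branch_entry) is hypothetical and only mimics what the function-pointer fallback amounts to conceptually.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct perf_branch_entry. */
struct branch_entry {
	unsigned long from;
	unsigned long to;
};

#define MAX_SNAPSHOT 32

static struct branch_entry snapshot_entries[MAX_SNAPSHOT];
static int snapshot_size;

/* The hook starts out NULL: no driver has registered a snapshot routine. */
static void (*snapshot_hook)(void) = NULL;

/* Rough analogue of static_call_cond(name)(): a no-op while the hook is NULL. */
static void snapshot_cond(void)
{
	if (snapshot_hook)
		snapshot_hook();
}

/* Rough analogue of static_call_update(): install or clear the target. */
static void snapshot_update(void (*fn)(void))
{
	snapshot_hook = fn;
}

/* Hypothetical driver-side routine that fills in two fake branch records. */
static void fake_pmu_snapshot(void)
{
	snapshot_entries[0] = (struct branch_entry){ .from = 0x1000, .to = 0x2000 };
	snapshot_entries[1] = (struct branch_entry){ .from = 0x3000, .to = 0x4000 };
	snapshot_size = 2;
}
```

With this shape the caller never needs an #ifdef: calling snapshot_cond() before any snapshot_update() does nothing, and after snapshot_update(fake_pmu_snapshot) it fills the buffer. The real static_call machinery additionally patches the call site on architectures that support it, which this sketch does not attempt to model.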
> +int perf_read_branch_snapshot(void *buf, size_t len)
> +{
> + int cnt;
> +
> + memcpy(buf, *this_cpu_ptr(&perf_branch_snapshot_entries),
> + min_t(u32, (u32)len,
> + sizeof(struct perf_branch_entry) * MAX_BRANCH_SNAPSHOT));
> + cnt = *this_cpu_ptr(&perf_branch_snapshot_size);
> +
> + return (cnt > 0) ? cnt : -EOPNOTSUPP;
> +}
Doesn't seem used at all..