Re: [PATCH] perf/x86/intel: Allow to setup LBR for counting event for BPF

From: Andrii Nakryiko
Date: Mon Sep 09 2024 - 13:10:36 EST


On Mon, Sep 9, 2024 at 8:58 AM <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>
> The BPF subsystem may capture LBR data on a counting event. However, the
> current implementation assumes that LBR can/should only be used with
> sampling events.
>
> For instance, the retsnoop tool ([0]) makes extensive use of this
> functionality and sets up the perf event as follows:
>
> struct perf_event_attr attr;
>
> memset(&attr, 0, sizeof(attr));
> attr.size = sizeof(attr);
> attr.type = PERF_TYPE_HARDWARE;
> attr.config = PERF_COUNT_HW_CPU_CYCLES;
> attr.sample_type = PERF_SAMPLE_BRANCH_STACK;
> attr.branch_sample_type = PERF_SAMPLE_BRANCH_KERNEL;
>
> Limiting LBR to sampling events was meant to avoid unnecessary branch
> stack setup for a counting event in a sample-read group, because LBR
> is only read on the sampling event's overflow.
>
> Although LBR is used with sampling in most cases, there is no hardware
> limitation that binds LBR to sampling mode. Allow LBR setup for a
> counting event, except in sample-read mode.
>
> Fixes: 85846b27072d ("perf/x86: Add PERF_X86_EVENT_NEEDS_BRANCH_STACK flag")
> Reported-by: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> Closes: https://lore.kernel.org/lkml/20240905180055.1221620-1-andrii@xxxxxxxxxx/
> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
> arch/x86/events/intel/core.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>

LGTM, thanks! Tested and verified that this fixes the issue:

Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
Tested-by: Andrii Nakryiko <andrii@xxxxxxxxxx>

> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 605ed19043ed..2b5ff112d8d1 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3981,8 +3981,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
> x86_pmu.pebs_aliases(event);
> }
>
> - if (needs_branch_stack(event) && is_sampling_event(event))
> - event->hw.flags |= PERF_X86_EVENT_NEEDS_BRANCH_STACK;
> + if (needs_branch_stack(event)) {
> + /* Avoid branch stack setup for counting events in SAMPLE READ */
> + if (is_sampling_event(event) ||
> + !(event->attr.sample_type & PERF_SAMPLE_READ))
> + event->hw.flags |= PERF_X86_EVENT_NEEDS_BRANCH_STACK;
> + }
>
> if (branch_sample_counters(event)) {
> struct perf_event *leader, *sibling;
> --
> 2.38.1
>