Re: [PATCH V18 3/9] drivers: perf: arm_pmu: Add infrastructure for branch stack sampling

From: Mark Rutland
Date: Fri Jun 14 2024 - 11:04:45 EST


On Thu, Jun 13, 2024 at 11:47:25AM +0530, Anshuman Khandual wrote:
> @@ -289,6 +289,23 @@ static void armpmu_start(struct perf_event *event, int flags)
> {
> struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
> struct hw_perf_event *hwc = &event->hw;
> + struct pmu_hw_events *cpuc = this_cpu_ptr(armpmu->hw_events);
> + int idx;
> +
> + /*
> + * Merge all branch filter requests from different perf
> + * events being added into this PMU. This includes both
> + * privilege and branch type filters.
> + */
> + if (armpmu->has_branch_stack) {
> + cpuc->branch_sample_type = 0;
> + for (idx = 0; idx < ARMPMU_MAX_HWEVENTS; idx++) {
> + struct perf_event *event_idx = cpuc->events[idx];
> +
> + if (event_idx && has_branch_stack(event_idx))
> + cpuc->branch_sample_type |= event_idx->attr.branch_sample_type;
> + }
> + }

When we spoke about this, I meant that we should do this under
arm_pmu::start(), or a callee or caller thereof, once we know all the events
are configured and just before we actually enable the PMU.

For example, this could live in armv8pmu_branch_enable(), which'd allow
all the actual logic to be added in the BRBE enablement patch.

Doing this in armpmu_start() doesn't work as well because it won't handle
events being removed.
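
To illustrate the difference, here is a rough, untested userspace sketch
rather than kernel code. The mock_* types and functions are hypothetical
stand-ins for perf_event, pmu_hw_events and the enable path; none of them
exist in the tree. The point is only that the merged filter is rebuilt from
the events installed at enable time, so a removed event's bits drop out
without the add/del paths having to maintain the mask:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_HWEVENTS 4	/* stand-in for ARMPMU_MAX_HWEVENTS */

/* Hypothetical stand-in for struct perf_event. */
struct mock_event {
	int wants_branch_stack;		/* models has_branch_stack(event) */
	uint64_t branch_sample_type;	/* models attr.branch_sample_type */
};

/* Hypothetical stand-in for struct pmu_hw_events. */
struct mock_hw_events {
	struct mock_event *events[MOCK_MAX_HWEVENTS];
	uint64_t branch_sample_type;
};

/*
 * Rebuild the merged filter from whatever is installed right now.
 * Intended to run just before the PMU is enabled, so that both
 * additions and removals are reflected in the result.
 */
static void mock_branch_enable(struct mock_hw_events *cpuc)
{
	uint64_t merged = 0;

	for (size_t idx = 0; idx < MOCK_MAX_HWEVENTS; idx++) {
		struct mock_event *ev = cpuc->events[idx];

		if (ev && ev->wants_branch_stack)
			merged |= ev->branch_sample_type;
	}

	cpuc->branch_sample_type = merged;
}

/* The add/del paths only touch the event array, not the merged filter. */
static void mock_add(struct mock_hw_events *cpuc, size_t idx,
		     struct mock_event *ev)
{
	assert(idx < MOCK_MAX_HWEVENTS);
	cpuc->events[idx] = ev;
}

static void mock_del(struct mock_hw_events *cpuc, size_t idx)
{
	assert(idx < MOCK_MAX_HWEVENTS);
	cpuc->events[idx] = NULL;
}
```

With the merge done at enable time, deleting an event and re-enabling drops
that event's filter bits. A merge done only when an event is started would
leave them set until some other event happened to be started.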

[...]

> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index b3b34f6670cf..9eda16dd684e 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -46,6 +46,18 @@ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_63BIT) == ARMPMU_EVT_63BIT);
> }, \
> }
>
> +/*
> + * Maximum branch record entries which could be processed
> + * for core perf branch stack sampling support, regardless
> + * of the hardware support available on a given ARM PMU.
> + */
> +#define MAX_BRANCH_RECORDS 64
> +
> +struct branch_records {
> + struct perf_branch_stack branch_stack;
> + struct perf_branch_entry branch_entries[MAX_BRANCH_RECORDS];
> +};
> +
> /* The events for a given PMU register set. */
> struct pmu_hw_events {
> /*
> @@ -66,6 +78,17 @@ struct pmu_hw_events {
> struct arm_pmu *percpu_pmu;
>
> int irq;
> +
> + struct branch_records *branches;
> +
> + /* Active context for task events */
> + void *branch_context;

Using 'void *' here makes this harder to reason about and hides type
safety issues.

Give this a real type. IIUC it should be 'struct perf_event_context *'.
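
As a rough, untested illustration of what the typed pointer buys us: the
sketch below uses hypothetical stand-in types (mock_ctx for 'struct
perf_event_context', typed_hw_events for pmu_hw_events), and it assumes the
field is only ever compared against, or assigned from, the incoming context,
which this hunk doesn't show. With a real type, passing anything other than a
context pointer to these helpers gets a compiler diagnostic, whereas with
'void *' it would be accepted silently:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_event_context. */
struct mock_ctx {
	int id;
};

/* Hypothetical stand-in for the relevant part of struct pmu_hw_events. */
struct typed_hw_events {
	/* Active context for task events, with a real type */
	struct mock_ctx *branch_context;
};

/* Has the active context changed relative to the one being scheduled in? */
static bool mock_ctx_changed(const struct typed_hw_events *cpuc,
			     const struct mock_ctx *next)
{
	return cpuc->branch_context != next;
}

/* Record the new active context. */
static void mock_set_ctx(struct typed_hw_events *cpuc, struct mock_ctx *next)
{
	assert(cpuc != NULL);
	cpuc->branch_context = next;
}
```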

> +
> + /* Active events requesting branch records */
> + unsigned int branch_users;
> +
> + /* Active branch sample type filters */
> + unsigned long branch_sample_type;
> };
>
> enum armpmu_attr_groups {
> @@ -96,8 +119,15 @@ struct arm_pmu {
> void (*stop)(struct arm_pmu *);
> void (*reset)(void *);
> int (*map_event)(struct perf_event *event);
> + void (*sched_task)(struct perf_event_pmu_context *pmu_ctx, bool sched_in);
> + bool (*branch_stack_init)(struct perf_event *event);
> + void (*branch_stack_add)(struct perf_event *event, struct pmu_hw_events *cpuc);
> + void (*branch_stack_del)(struct perf_event *event, struct pmu_hw_events *cpuc);
> + void (*branch_stack_reset)(void);

The reset callback isn't used in this series.

Subsequent patches call armv8pmu_branch_stack_reset() directly from
PMUv3 and the BRBE driver, and arm_pmu::branch_stack_reset() is never
used, so we can delete it.

Mark.