Re: [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events for BRBE

From: Anshuman Khandual
Date: Tue Sep 13 2022 - 01:34:01 EST




On 9/12/22 15:42, Mark Brown wrote:
> On Thu, Sep 08, 2022 at 10:40:42AM +0530, Anshuman Khandual wrote:
>
>> + /* Captured BRBE buffer - copied as is into perf_sample_data */
>> + struct perf_branch_stack brbe_stack;
>> + struct perf_branch_entry brbe_entries[BRBE_MAX_ENTRIES];
>
> It looks like perf_branch_entry is intended to be the variably
> sized entries array at the end of perf_branch_stack? That could

That is right, because the maximum number of entries in the brbe_entries[]
array is platform dependent, i.e. BHRB_MAX_ENTRIES on powerpc,
MAX_LBR_ENTRIES on x86 and BRBE_MAX_ENTRIES on arm64.

The generic definition

struct perf_branch_stack {
	__u64				nr;
	__u64				hw_idx;
	struct perf_branch_entry	entries[];
};

On x86 platform

#define MAX_LBR_ENTRIES 32

struct cpu_hw_events {
	....
	struct perf_branch_stack	lbr_stack;
	struct perf_branch_entry	lbr_entries[MAX_LBR_ENTRIES];
	....
};

On powerpc platform

#define BHRB_MAX_ENTRIES 32

struct cpu_hw_events {
	....
	struct perf_branch_stack	bhrb_stack;
	struct perf_branch_entry	bhrb_entries[BHRB_MAX_ENTRIES];
	....
};

The same format is followed on the arm64 platform as well

#define BRBE_MAX_ENTRIES 64

struct pmu_hw_events {
	....
	....
	struct perf_branch_stack	brbe_stack;
	struct perf_branch_entry	brbe_entries[BRBE_MAX_ENTRIES];
	....
	....
};
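
For illustration only, here is a user-space sketch of why this layout
works: the fixed-size array placed right after the stack header lands
exactly where the flexible 'entries[]' member begins. All sketch_* names
are hypothetical stand-ins and not kernel definitions, the three-field
entry layout is an assumption, and embedding a struct with a flexible
array member in the middle of another struct relies on a GNU C extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ENTRIES 64	/* mirrors BRBE_MAX_ENTRIES from the patch */

/* Simplified stand-in for struct perf_branch_entry (assumed layout) */
struct sketch_branch_entry {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
};

/* Same shape as the generic struct perf_branch_stack */
struct sketch_branch_stack {
	uint64_t nr;
	uint64_t hw_idx;
	struct sketch_branch_entry entries[];
};

/* Same pattern as cpu_hw_events / pmu_hw_events */
struct sketch_hw_events {
	struct sketch_branch_stack stack;	/* GNU extension: not the last member */
	struct sketch_branch_entry entries[SKETCH_MAX_ENTRIES];
};

/* Returns 1 when stack.entries[] starts exactly at the fixed-size array */
static int sketch_entries_overlay(void)
{
	return offsetof(struct sketch_hw_events, entries) ==
	       offsetof(struct sketch_hw_events, stack) +
	       offsetof(struct sketch_branch_stack, entries);
}
```

If sketch_entries_overlay() returns 1, a pointer to 'stack' can be handed
to code that walks 'stack.entries[]' and it will read the fixed-size array,
which is what the "copied as is" comment in the patch depends on.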

> probably do with being called out if it's the case. It feels

Right, we could add a comment in this regard.

> like it would be clearer and safer to allocate these dynamically
> when BRBE is used if that's possible, I'd expect that should also
> deal with the stack frame size issues as well.

That might not be possible, because the generic 'struct perf_branch_stack'
expects 'perf_branch_stack.entries' to be a flexible array that is
contiguous in memory with the other members of 'perf_branch_stack'. Besides,
that would be a deviation from the similar implementations on the x86 and
powerpc platforms.

The stack frame size issue came up because BRBE_MAX_ENTRIES is 64, compared
to just 32 on the other platforms, which follow the exact same method.
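
As a rough illustration of the difference, the sketch below computes the
footprint of the header plus the fixed-size array for a given entry count.
The 16 byte header and 24 byte entry are assumptions based on a simplified
three-field entry; the real sizeof(struct perf_branch_entry) should be
checked against the kernel headers in use. The size_sketch_* names and
branch_buffer_bytes() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layouts: header is (nr, hw_idx), entry is (from, to, flags) */
struct size_sketch_header {
	uint64_t nr;
	uint64_t hw_idx;
};

struct size_sketch_entry {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
};

/* Bytes taken by the stack header plus max_entries branch records */
static size_t branch_buffer_bytes(size_t max_entries)
{
	return sizeof(struct size_sketch_header) +
	       max_entries * sizeof(struct size_sketch_entry);
}
```

Under these assumptions, 32 entries take 784 bytes and 64 entries take
1552 bytes, so any on-stack instance of the containing structure roughly
doubles in this part alone.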