Re: [PATCH v1 01/13] perf/core: add union to struct perf_branch_entry

From: Michael Ellerman
Date: Fri Sep 10 2021 - 10:16:42 EST


Michael Ellerman <mpe@xxxxxxxxxxxxxx> writes:
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
>> On Thu, Sep 09, 2021 at 12:56:48AM -0700, Stephane Eranian wrote:
>>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>>> index f92880a15645..eb11f383f4be 100644
>>> --- a/include/uapi/linux/perf_event.h
>>> +++ b/include/uapi/linux/perf_event.h
>>> @@ -1329,13 +1329,18 @@ union perf_mem_data_src {
>>> struct perf_branch_entry {
>>> __u64 from;
>>> __u64 to;
>>> - __u64 mispred:1, /* target mispredicted */
>>> - predicted:1,/* target predicted */
>>> - in_tx:1, /* in transaction */
>>> - abort:1, /* transaction abort */
>>> - cycles:16, /* cycle count to last branch */
>>> - type:4, /* branch type */
>>> - reserved:40;
>>> + union {
>>> + __u64 val; /* to make it easier to clear all fields */
>>> + struct {
>>> + __u64 mispred:1, /* target mispredicted */
>>> + predicted:1,/* target predicted */
>>> + in_tx:1, /* in transaction */
>>> + abort:1, /* transaction abort */
>>> + cycles:16, /* cycle count to last branch */
>>> + type:4, /* branch type */
>>> + reserved:40;
>>> + };
>>> + };
>>> };
>>
>>
>> Hurpmh... all other bitfields have ENDIAN_BITFIELD things except this
>> one. Power folks, could you please have a look?
>
> The bit number of each field changes between big and little endian, but
> as long as kernel and userspace are the same endian, and both only
> access values via the bitfields then it works.
...
>
> It does look like we have a bug in perf tool though, if I take a
> perf.data from a big endian system to a little endian one I don't see
> any of the branch flags decoded. eg:
>
> BE:
>
> 2413132652524 0x1db8 [0x2d0]: PERF_RECORD_SAMPLE(IP, 0x1): 5279/5279: 0xc00000000045c028 period: 923003 addr: 0
> ... branch stack: nr:28
> ..... 0: c00000000045c028 -> c00000000dce7604 0 cycles P 0
>
> LE:
>
> 2413132652524 0x1db8 [0x2d0]: PERF_RECORD_SAMPLE(IP, 0x1): 5279/5279: 0xc00000000045c028 period: 923003 addr: 0
> ... branch stack: nr:28
> ..... 0: c00000000045c028 -> c00000000dce7604 0 cycles 0
> ^
> missing P
>
> I guess we're missing a byte swap somewhere.

Ugh. We _do_ have a byte swap, but we also need a bit swap.

That works for the single bit fields, not sure if it will for the
multi-bit fields.

So that's a bit of a mess :/

cheers