Re: [PATCH v1 01/13] perf/core: add union to struct perf_branch_entry

From: Stephane Eranian
Date: Wed Sep 15 2021 - 02:03:40 EST


Michael,


On Fri, Sep 10, 2021 at 7:16 AM Michael Ellerman <mpe@xxxxxxxxxxxxxx> wrote:
>
> Michael Ellerman <mpe@xxxxxxxxxxxxxx> writes:
> > Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
> >> On Thu, Sep 09, 2021 at 12:56:48AM -0700, Stephane Eranian wrote:
> >>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> >>> index f92880a15645..eb11f383f4be 100644
> >>> --- a/include/uapi/linux/perf_event.h
> >>> +++ b/include/uapi/linux/perf_event.h
> >>> @@ -1329,13 +1329,18 @@ union perf_mem_data_src {
> >>> struct perf_branch_entry {
> >>> __u64 from;
> >>> __u64 to;
> >>> - __u64 mispred:1, /* target mispredicted */
> >>> - predicted:1,/* target predicted */
> >>> - in_tx:1, /* in transaction */
> >>> - abort:1, /* transaction abort */
> >>> - cycles:16, /* cycle count to last branch */
> >>> - type:4, /* branch type */
> >>> - reserved:40;
> >>> + union {
> >>> + __u64 val; /* to make it easier to clear all fields */
> >>> + struct {
> >>> + __u64 mispred:1, /* target mispredicted */
> >>> + predicted:1,/* target predicted */
> >>> + in_tx:1, /* in transaction */
> >>> + abort:1, /* transaction abort */
> >>> + cycles:16, /* cycle count to last branch */
> >>> + type:4, /* branch type */
> >>> + reserved:40;
> >>> + };
> >>> + };
> >>> };
> >>
> >>
> >> Hurpmh... all other bitfields have ENDIAN_BITFIELD things except this
> >> one. Power folks, could you please have a look?
> >
> > The bit number of each field changes between big and little endian, but
> > as long as kernel and userspace are the same endian, and both only
> > access values via the bitfields then it works.
> ...
> >
> > It does look like we have a bug in perf tool though, if I take a
> > perf.data from a big endian system to a little endian one I don't see
> > any of the branch flags decoded. eg:
> >
> > BE:
> >
> > 2413132652524 0x1db8 [0x2d0]: PERF_RECORD_SAMPLE(IP, 0x1): 5279/5279: 0xc00000000045c028 period: 923003 addr: 0
> > ... branch stack: nr:28
> > ..... 0: c00000000045c028 -> c00000000dce7604 0 cycles P 0
> >
> > LE:
> >
> > 2413132652524 0x1db8 [0x2d0]: PERF_RECORD_SAMPLE(IP, 0x1): 5279/5279: 0xc00000000045c028 period: 923003 addr: 0
> > ... branch stack: nr:28
> > ..... 0: c00000000045c028 -> c00000000dce7604 0 cycles 0
> > ^
> > missing P
> >
> > I guess we're missing a byte swap somewhere.
>
> Ugh. We _do_ have a byte swap, but we also need a bit swap.
>
> That works for the single bit fields, not sure if it will for the
> multi-bit fields.
>
> So that's a bit of a mess :/
>
Based on what I see in perf_event.h for other structures, I think I
can make up what you would need for struct branch_entry. But Iit would
be easier if you could send me a patch that you would have verified on
your systems.
Thanks.