Re: [PATCH v1 01/25] perf sample: Document struct perf_sample

From: Ian Rogers

Date: Fri Mar 20 2026 - 00:41:54 EST


On Mon, Mar 2, 2026 at 7:07 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> On Mon, Feb 09, 2026 at 09:40:08AM -0800, Ian Rogers wrote:
> > Add kernel-doc for struct perf_sample capturing the somewhat unusual
> > population of fields and lifetime relationships.
>
> Thanks for doing this.
>
> >
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > tools/perf/util/sample.h | 90 ++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 87 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
> > index 3cce8dd202aa..3fd2a5e01308 100644
> > --- a/tools/perf/util/sample.h
> > +++ b/tools/perf/util/sample.h
> > @@ -81,47 +81,131 @@ struct simd_flags {
> > #define SIMD_OP_FLAGS_PRED_PARTIAL 0x01 /* partial predicate */
> > #define SIMD_OP_FLAGS_PRED_EMPTY 0x02 /* empty predicate */
> >
> > +/**
> > + * struct perf_sample
> > + *
> > + * A sample is generally filled in by evlist__parse_sample/evsel__parse_sample
> > + * which fills in the variables from a "union perf_event *event" which is data
> > + * from a perf ring buffer or perf.data file. The "event" sample is variable in
> > + * length as determined by the perf_event_attr (in the evsel) and details within
> > + * the sample event itself. A struct perf_sample avoids needing to care about
> > + * the variable length nature of the original event.
> > + *
> > + * To avoid being excessively large parts of the struct perf_sample are pointers
> > + * into the original sample event. In general the lifetime of a struct
> > + * perf_sample needs to be less than the "union perf_event *event" it was
> > + * derived from.
> > + *
> > + * The struct regs_dump user_regs and intr_regs are lazily allocated again for
> > + * size reasons, due to them holding a cache of looked up registers. The
> > + * function pair of perf_sample__init and perf_sample__exit correctly initialize
> > + * and clean up these values.
> > + */
> > struct perf_sample {
> > + /** @ip: The sample event PERF_SAMPLE_IP value. */
> > u64 ip;
> > - u32 pid, tid;
> > + /** @pid: The sample event PERF_SAMPLE_TID pid value. */
> > + u32 pid;
> > + /** @tid: The sample event PERF_SAMPLE_TID tid value. */
> > + u32 tid;
> > + /** @time: The sample event PERF_SAMPLE_TIME value. */
> > u64 time;
> > + /** @addr: The sample event PERF_SAMPLE_ADDR value. */
> > u64 addr;
> > + /** @id: The sample event PERF_SAMPLE_ID value. */
> > u64 id;
> > + /** @stream_id: The sample event PERF_SAMPLE_STREAM_ID value. */
> > u64 stream_id;
> > + /** @period: The sample event PERF_SAMPLE_PERIOD value. */
> > u64 period;
> > + /** @weight: Data determined by PERF_SAMPLE_WEIGHT or PERF_SAMPLE_WEIGHT_STRUCT. */
> > u64 weight;
> > + /** @transaction: The sample event PERF_SAMPLE_TRANSACTION value. */
> > u64 transaction;
> > + /** @insn_cnt: Filled in and used by intel-pt. */
> > u64 insn_cnt;
> > + /** @cyc_cnt: Filled in and used by intel-pt. */
> > u64 cyc_cnt;
> > + /** @cpu: The sample event PERF_SAMPLE_CPU value. */
> > u32 cpu;
> > + /**
> > + * @raw_size: The size in bytes of raw data from PERF_SAMPLE_RAW. For
> > + * alignment reasons this should always be a multiple of
> > + * sizeof(u64) + sizeof(u32).
> > + */
> > u32 raw_size;
> > + /** @data_src: The sample event PERF_SAMPLE_DATA_SRC value. */
> > u64 data_src;
> > + /** @phys_addr: The sample event PERF_SAMPLE_PHYS_ADDR value. */
> > u64 phys_addr;
> > + /** @data_page_size: The sample event PERF_SAMPLE_DATA_PAGE_SIZE value. */
> > u64 data_page_size;
> > + /** @code_page_size: The sample event PERF_SAMPLE_CODE_PAGE_SIZE value. */
> > u64 code_page_size;
> > + /** @cgroup: The sample event PERF_SAMPLE_CGROUP value. */
> > u64 cgroup;
> > + /** @flags: Extra flag data from auxiliary events like intel-pt. */
> > u32 flags;
> > + /** @machine_pid: The guest machine pid derived from the sample id. */
> > u32 machine_pid;
> > + /** @vcpu: The guest machine vcpu derived from the sample id. */
> > u32 vcpu;
> > + /** @insn_len: Instruction length from auxiliary events like intel-pt. */
> > u16 insn_len;
>
> Does it control the insn array later?
>
>
> > + /**
> > + * @cpumode: The cpumode from struct perf_event_header misc variable
> > + * masked with CPUMODE_MASK. Gives user, kernel and hypervisor
> > + * information.
> > + */
> > u8 cpumode;
> > + /** @misc: The entire struct perf_event_header misc variable. */
> > u16 misc;
> > + /** @ins_lat: Instruction latency information from auxiliary events like intel-pt. */
> > u16 ins_lat;
>
> I think this is weight2 coming from PERF_SAMPLE_WEIGHT_STRUCT.
>
>
> > /** @weight3: On x86 holds retire_lat, on powerpc holds p_stage_cyc. */
> > u16 weight3;
>
> You may also want to mention it's from WEIGHT_STRUCT.
>
>
> > - bool no_hw_idx; /* No hw_idx collected in branch_stack */
> > - bool deferred_callchain; /* Has deferred user callchains */
> > + /**
> > + * @no_hw_idx: For PERF_SAMPLE_BRANCH_STACK, true when
> > + * PERF_SAMPLE_BRANCH_HW_INDEX isn't set.
> > + */
> > + bool no_hw_idx;
> > + /**
> > + * @deferred_callchain: When processing PERF_SAMPLE_CALLCHAIN a deferred
> > + * user callchain marker was encountered.
> > + */
> > + bool deferred_callchain;
>
> If this is set, then callchain entry is allocated for deferred
> callchains.

Setting deferred_callchain doesn't mean the callchain entry has been
allocated. At that point the callchain still points into the event:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/evsel.c?h=perf-tools-next#n3384

>
>
> > + /**
> > + * @deferred_cookie: Identifier of the deferred callchain in the later
> > + * PERF_RECORD_CALLCHAIN_DEFERRED event.
> > + */
> > u64 deferred_cookie;
> > + /** @insn: A copy of the sampled instruction filled in by perf_sample__fetch_insn. */
> > char insn[MAX_INSN];
> > + /** @raw_data: Byte aligned pointer into the original event for PERF_SAMPLE_RAW data. */
>
> I think it's 32-bit aligned (or shifted) as it comes right after the
> raw_size (32-bit). As other fields in the event are 64-bit aligned this
> caused some troubles like in perf trace when you access the pointer
> directly and assume it's naturally aligned.
>
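
For illustration only, a minimal sketch of reading a 64-bit value out of a
payload that is only 32-bit aligned. demo_read_u64 is a made-up helper, not
something in perf; the point is just that copying through memcpy avoids
assuming natural alignment of the pointer.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of perf: read a u64 at 'offset' bytes into a
 * raw payload whose start may only be 4-byte aligned. memcpy makes no
 * alignment assumption, unlike dereferencing a cast (uint64_t *) pointer.
 */
static uint64_t demo_read_u64(const void *raw_data, size_t offset)
{
	uint64_t value;

	memcpy(&value, (const unsigned char *)raw_data + offset, sizeof(value));
	return value;
}
```

A caller would pass the raw data pointer and a field offset instead of
casting the pointer to a struct and dereferencing it.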
>
> > void *raw_data;
> > + /** @callchain: Pointer into the original event for PERF_SAMPLE_CALLCHAIN data. */
> > struct ip_callchain *callchain;
>
> If deferred_callchain is not set, it just points to data in the mmap
> buffer and it should not be freed.

But that may also be true when deferred_callchain is set. Sashiko is
picking up on the inconsistencies. I'll weasel-word it by referring to
the sample__merge_deferred_callchain function.
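
As a rough illustration of the two states being discussed (this is not the
perf code; every demo_* name and the callchain_owned flag are made up for the
sketch): after parsing, the callchain borrows memory from the event whether or
not a deferred marker was seen, and only a later merge step replaces it with
an allocation that has to be freed. The sketch simply concatenates the two
chains and ignores any marker handling.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, not the perf definitions. */
struct demo_callchain {
	uint64_t nr;
	uint64_t ips[];
};

struct demo_sample {
	struct demo_callchain *callchain; /* borrowed unless callchain_owned */
	bool deferred_callchain;          /* a deferred marker was seen while parsing */
	bool callchain_owned;             /* hypothetical: set once a merged copy exists */
};

/* Parse step: the callchain points into the event, nothing is allocated. */
static void demo_parse(struct demo_sample *s, struct demo_callchain *in_event,
		       bool saw_marker)
{
	s->callchain = in_event;
	s->deferred_callchain = saw_marker;
	s->callchain_owned = false;
}

/* Merge step: allocate a new chain holding the original then deferred entries. */
static int demo_merge(struct demo_sample *s, const struct demo_callchain *deferred)
{
	uint64_t nr_orig = s->callchain->nr;
	uint64_t nr = nr_orig + deferred->nr;
	struct demo_callchain *merged;

	merged = malloc(sizeof(*merged) + nr * sizeof(uint64_t));
	if (!merged)
		return -1;

	merged->nr = nr;
	memcpy(merged->ips, s->callchain->ips, nr_orig * sizeof(uint64_t));
	memcpy(merged->ips + nr_orig, deferred->ips, deferred->nr * sizeof(uint64_t));

	s->callchain = merged;
	s->callchain_owned = true;
	return 0;
}

/* Cleanup: free only what the merge step allocated, never event memory. */
static void demo_exit(struct demo_sample *s)
{
	if (s->callchain_owned)
		free(s->callchain);
	s->callchain = NULL;
	s->callchain_owned = false;
}
```

In this sketch deferred_callchain alone says nothing about ownership, which
is the inconsistency the kernel-doc wording needs to avoid.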

Thanks,
Ian

> > + /** @branch_stack: Pointer into the original event for PERF_SAMPLE_BRANCH_STACK data. */
> > struct branch_stack *branch_stack;
> > + /**
> > + * @branch_stack_cntr: Pointer into the original event for
> > + * PERF_SAMPLE_BRANCH_COUNTERS data.
> > + */
> > u64 *branch_stack_cntr;
> > + /** @user_regs: Values and pointers into the sample for PERF_SAMPLE_REGS_USER. */
> > struct regs_dump *user_regs;
> > + /** @intr_regs: Values and pointers into the sample for PERF_SAMPLE_REGS_INTR. */
> > struct regs_dump *intr_regs;
> > + /** @user_stack: Size and pointer into the sample for PERF_SAMPLE_STACK_USER. */
> > struct stack_dump user_stack;
> > + /** @read: The sample event PERF_SAMPLE_READ counter values. */
>
> The actual format depends on read_format in the event attribute.
>
> Thanks,
> Namhyung
>
>
> > struct sample_read read;
> > + /**
> > + * @aux_sample: Similar to raw data but with a 64-bit size and
> > + * alignment, PERF_SAMPLE_AUX data.
> > + */
> > struct aux_sample aux_sample;
> > + /** @simd_flags: SIMD flag information from ARM SPE auxiliary events. */
> > struct simd_flags simd_flags;
> > };
> >
> > --
> > 2.53.0.239.g8d8fc8a987-goog
> >