Re: [PATCH bpf-next] bpf: Add bpf_read_raw_record() helper

From: Namhyung Kim
Date: Fri Aug 26 2022 - 15:22:03 EST


On Fri, Aug 26, 2022 at 11:09 AM Song Liu <songliubraving@xxxxxx> wrote:
>
>
>
> > On Aug 26, 2022, at 9:33 AM, Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >
> > On Thu, Aug 25, 2022 at 10:53 PM Song Liu <song@xxxxxxxxxx> wrote:
> >>
> >> On Thu, Aug 25, 2022 at 10:22 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >>>
> >>> On Thu, Aug 25, 2022 at 7:35 PM Song Liu <songliubraving@xxxxxx> wrote:
> >>>> Actually, since we are on this, can we make it more generic, and handle
> >>>> all possible PERF_SAMPLE_* (in enum perf_event_sample_format)? Something
> >>>> like:
> >>>>
> >>>> long bpf_perf_event_read_sample(void *ctx, void *buf, u64 size, u64 flags);
> >>>>
> >>>> WDYT Namhyung?
> >>>
> >>> Do you mean reading the whole sample data at once?
> >>> Then it needs to parse the sample data format properly
> >>> which is non-trivial due to a number of variable-length
> >>> fields like callchains and branch stack, etc.
> >>>
> >>> Also I'm afraid I might need event configuration info
> >>> other than sample data like attr.type, attr.config,
> >>> attr.sample_type and so on.
> >>>
> >>> Hmm.. maybe we can add it to the ctx directly like ctx.attr_type?
> >>
> >> The user should have access to the perf_event_attr used to
> >> create the event. This is also available in ctx->event->attr.
> >
> > Do you mean from BPF? I'd like to have a generic BPF program
> > that can handle various filtering according to the command line
> > arguments. I'm not sure but it might do something differently
> > for each event based on the attr settings.
>
> Yeah, we can access perf_event_attr from BPF program. Note that
> the ctx for perf_event bpf program is struct bpf_perf_event_data_kern:
>
> SEC("perf_event")
> int perf_e(struct bpf_perf_event_data_kern *ctx)
> {
>         ...
> }
>
> struct bpf_perf_event_data_kern {
>         bpf_user_pt_regs_t *regs;
>         struct perf_sample_data *data;
>         struct perf_event *event;
> };

I didn't know that accessing the kernel data directly is allowed.
For some reason, I thought a program could only use the fields in
bpf_perf_event_data, like sample_period and addr, with the verifier
converting those accesses according to pe_prog_convert_ctx_access().
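
To illustrate what I had in mind, here is a rough user-space sketch of
that conversion. It is not kernel code: the mock_* names are made up for
illustration, the structs are trimmed-down stand-ins for the real ones,
and the real pe_prog_convert_ctx_access() emits BPF load instructions
rather than calling a C function. The idea is only that a load at the
UAPI offset of sample_period or addr ends up reading through the data
pointer in the kernel-side context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct perf_sample_data. */
struct mock_sample_data {
        uint64_t addr;
        uint64_t period;
};

/* Stand-in for struct bpf_perf_event_data_kern (what the kernel passes). */
struct mock_ctx_kern {
        void *regs;
        struct mock_sample_data *data;
        void *event;
};

/* Stand-in for the UAPI struct bpf_perf_event_data (what the program sees). */
struct mock_ctx_uapi {
        uint64_t regs;          /* placeholder for bpf_user_pt_regs_t */
        uint64_t sample_period;
        uint64_t addr;
};

/*
 * Resolve a load at a UAPI field offset into a read through the
 * kernel-side context. Returns 0 on success, -1 for offsets this
 * sketch does not model (for example the regs area).
 */
static int mock_convert_ctx_access(const struct mock_ctx_kern *ctx,
                                   size_t uapi_off, uint64_t *out)
{
        assert(ctx && ctx->data && out);

        switch (uapi_off) {
        case offsetof(struct mock_ctx_uapi, sample_period):
                *out = ctx->data->period;       /* ctx->data->period */
                return 0;
        case offsetof(struct mock_ctx_uapi, addr):
                *out = ctx->data->addr;         /* ctx->data->addr */
                return 0;
        default:
                return -1;
        }
}
```

So with this model, a program written against the UAPI layout never
touches ctx->data itself; the rewrite does it on its behalf.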

>
> Alternatively, we can also have bpf user space configure the BPF
> program via a few knobs.
>
> And actually, we can just read ctx->data and get the raw record,
> right..?

If it's possible, sure, it'd be more powerful.

Thanks,
Namhyung