Re: [PATCH bpf-next 1/5] bpf: block bpf_get_[stack|stackid] on perf_event with PEBS entries

From: Andrii Nakryiko
Date: Sun Jul 12 2020 - 01:07:04 EST


On Fri, Jul 10, 2020 at 11:28 PM Song Liu <songliubraving@xxxxxx> wrote:
>
>
>
> > On Jul 10, 2020, at 8:53 PM, Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@xxxxxx> wrote:
> >>
> >> Calling get_perf_callchain() on perf_events from PEBS entries may cause
> >> unwinder errors. To fix this issue, the callchain is fetched early. Such
> >> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
> >>
> >> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
> >> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
> >> these perf_events. Unfortunately, bpf verifier cannot tell whether the
> >> program will be attached to perf_event with PEBS entries. Therefore,
> >> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
> >>
> >> Signed-off-by: Song Liu <songliubraving@xxxxxx>
> >> ---
> >
> > Perhaps it's a stupid question, but why bpf_get_stack/bpf_get_stackid
> > can't figure out automatically that they are called from
> > __PERF_SAMPLE_CALLCHAIN_EARLY perf event and use different callchain,
> > if necessary?
> >
> > It is quite suboptimal from a user experience point of view to require
> > two different BPF helpers depending on PEBS or non-PEBS perf events.
>
> I am not aware of an easy way to tell the difference in bpf_get_stack.
> But I do agree that would be much better.
>

Hm... Looking a bit more how all this is tied together in the kernel,
I think it's actually quite easy. So, for perf_event BPF program type:

1. return a special prototype for bpf_get_stack/bpf_get_stackid, which
will have this extra bit of logic for callchain. All other program
types with access to bpf_get_stack/bpf_get_stackid should use the
current one, probably.
2. For that special program, just like for bpf_read_branch_records(),
we know that context is actually `struct bpf_perf_event_data_kern *`,
and it has pt_regs, perf_sample_data and perf_event itself.
3. With that, it seems like you'll have everything you need to
automatically choose a proper callchain.

All this absolutely transparently to the BPF program.

Am I missing something?

> Thanks,
> Song