Re: [PATCH v1 1/3] perf bpf filter: Give terms their own enum

From: Namhyung Kim
Date: Mon May 20 2024 - 16:27:32 EST


On Fri, May 17, 2024 at 8:30 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> On Fri, May 17, 2024 at 6:36 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >
> > On Wed, May 15, 2024 at 9:20 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
> > >
> > > Give the term types their own enum so that additional terms can be
> > > added that don't correspond to a PERF_SAMPLE_xx flag. The term values
> > > are numerically ascending rather than bit field positions, this means
> > > they need translating to a PERF_SAMPLE_xx bit field in certain places
> > > and they are more densely encoded.
> > >
> > > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > > ---
> > [SNIP]
> > > diff --git a/tools/perf/util/bpf_skel/sample_filter.bpf.c b/tools/perf/util/bpf_skel/sample_filter.bpf.c
> > > index fb94f5280626..8666c85e9333 100644
> > > --- a/tools/perf/util/bpf_skel/sample_filter.bpf.c
> > > +++ b/tools/perf/util/bpf_skel/sample_filter.bpf.c
> > > @@ -48,31 +48,50 @@ static inline __u64 perf_get_sample(struct bpf_perf_event_data_kern *kctx,
> > > {
> > > struct perf_sample_data___new *data = (void *)kctx->data;
> > >
> > > - if (!bpf_core_field_exists(data->sample_flags) ||
> > > - (data->sample_flags & entry->flags) == 0)
> > > + if (!bpf_core_field_exists(data->sample_flags))
> > > return 0;
> > >
> > > - switch (entry->flags) {
> > > - case PERF_SAMPLE_IP:
> > > + switch (entry->term) {
> > > + case PBF_TERM_NONE:
> > > + return 0;
> > > + case PBF_TERM_IP:
> > > + if ((data->sample_flags & PERF_SAMPLE_IP) == 0)
> > > + return 0;
> >
> > Can we check this in a single place like in the original code
> > instead of doing it in every case? I think we can keep the
> > entry->flags and check it only if it's non zero. Then uid and
> > gid will have 0 to skip.
>
> I found the old way confusing. As the flags are a bitmap it looks like
> more than one can be set, if that were the case then the switch
> statement would be broken as the case wouldn't exist. Using an enum

The entry->flags is set by the bpf-filter code and it's guaranteed
to have a single bit.

> like this allows warnings to occur when a term is missing in the
> switch statement - which is good when you are adding new terms. I
> think it more obviously matches the man page. We could arrange for the
> enum values to encode the shift position of the flag. Something like:
>
> if ((entry->term > PBF_TERM_NONE && entry->term <= PBF_TERM_DATA_SRC) &&
> (data->sample_flags & (1 << entry->term)) == 0)
> return 0;

Actually I'm ok with enum (and bit shift if needed). I'm just thinking
if it'd be nicer if we can have a check in a single place.

>
> But the problem there is that not every sample type has an enum value,
> and I'm not sure it makes sense for things like STREAM_ID. We could do
> some macro magic to reduce the verbosity like:
>
> #define SAMPLE_CASE(x) \
> case PBF_TERM_##x: \
> if ((data->sample_flags & PERF_SAMPLE_x) == 0) \
> return 0
>
> But I thought that made the code harder to read given the relatively
> small number of sample cases.

I also want to avoid the macro if possible.

Thanks,
Namhyung


> >
> > > return kctx->data->ip;
> > > - case PERF_SAMPLE_ID:
> > > + case PBF_TERM_ID:
> > > + if ((data->sample_flags & PERF_SAMPLE_ID) == 0)
> > > + return 0;
> > > return kctx->data->id;
> > > - case PERF_SAMPLE_TID:
> > > + case PBF_TERM_TID:
> > > + if ((data->sample_flags & PERF_SAMPLE_TID) == 0)
> > > + return 0;
> > > if (entry->part)
> > > return kctx->data->tid_entry.pid;
> > > else
> > > return kctx->data->tid_entry.tid;
> > > - case PERF_SAMPLE_CPU:
> > > + case PBF_TERM_CPU:
> > > + if ((data->sample_flags & PERF_SAMPLE_CPU) == 0)
> > > + return 0;
> > > return kctx->data->cpu_entry.cpu;
> > > - case PERF_SAMPLE_TIME:
> > > + case PBF_TERM_TIME:
> > > + if ((data->sample_flags & PERF_SAMPLE_TIME) == 0)
> > > + return 0;
> > > return kctx->data->time;
> > > - case PERF_SAMPLE_ADDR:
> > > + case PBF_TERM_ADDR:
> > > + if ((data->sample_flags & PERF_SAMPLE_ADDR) == 0)
> > > + return 0;
> > > return kctx->data->addr;
> > > - case PERF_SAMPLE_PERIOD:
> > > + case PBF_TERM_PERIOD:
> > > + if ((data->sample_flags & PERF_SAMPLE_PERIOD) == 0)
> > > + return 0;
> > > return kctx->data->period;
> > > - case PERF_SAMPLE_TRANSACTION:
> > > + case PBF_TERM_TRANSACTION:
> > > + if ((data->sample_flags & PERF_SAMPLE_TRANSACTION) == 0)
> > > + return 0;
> > > return kctx->data->txn;
> > > - case PERF_SAMPLE_WEIGHT_STRUCT:
> > > + case PBF_TERM_WEIGHT_STRUCT:
> > > + if ((data->sample_flags & PERF_SAMPLE_WEIGHT_STRUCT) == 0)
> > > + return 0;
> > > if (entry->part == 1)
> > > return kctx->data->weight.var1_dw;
> > > if (entry->part == 2)
> > > @@ -80,15 +99,25 @@ static inline __u64 perf_get_sample(struct bpf_perf_event_data_kern *kctx,
> > > if (entry->part == 3)
> > > return kctx->data->weight.var3_w;
> > > /* fall through */
> > > - case PERF_SAMPLE_WEIGHT:
> > > + case PBF_TERM_WEIGHT:
> > > + if ((data->sample_flags & PERF_SAMPLE_WEIGHT) == 0)
> > > + return 0;
> > > return kctx->data->weight.full;
> > > - case PERF_SAMPLE_PHYS_ADDR:
> > > + case PBF_TERM_PHYS_ADDR:
> > > + if ((data->sample_flags & PERF_SAMPLE_PHYS_ADDR) == 0)
> > > + return 0;
> > > return kctx->data->phys_addr;
> > > - case PERF_SAMPLE_CODE_PAGE_SIZE:
> > > + case PBF_TERM_CODE_PAGE_SIZE:
> > > + if ((data->sample_flags & PERF_SAMPLE_CODE_PAGE_SIZE) == 0)
> > > + return 0;
> > > return kctx->data->code_page_size;
> > > - case PERF_SAMPLE_DATA_PAGE_SIZE:
> > > + case PBF_TERM_DATA_PAGE_SIZE:
> > > + if ((data->sample_flags & PERF_SAMPLE_DATA_PAGE_SIZE) == 0)
> > > + return 0;
> > > return kctx->data->data_page_size;
> > > - case PERF_SAMPLE_DATA_SRC:
> > > + case PBF_TERM_DATA_SRC:
> > > + if ((data->sample_flags & PERF_SAMPLE_DATA_SRC) == 0)
> > > + return 0;
> > > if (entry->part == 1)
> > > return kctx->data->data_src.mem_op;
> > > if (entry->part == 2)
> > > --
> > > 2.45.0.rc1.225.g2a3ae87e7f-goog
> > >