Re: [PATCH v2 1/3] bpf: Allow NULL buffers in bpf_dynptr_slice(_rw)

From: Jakub Kicinski
Date: Tue Jul 18 2023 - 14:11:12 EST


On Tue, 18 Jul 2023 10:50:14 -0700 Alexei Starovoitov wrote:
> On Tue, Jul 18, 2023 at 10:18 AM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
> > > you're still missing the point. Pls read the whole patch series.
> >
> > Could you just tell me what the point is then? The "series" is one
> > patch plus some tiny selftests. I don't see any documentation for
> > how dynptrs are supposed to work either.
> >
> > As far as I can grasp this makes the "copy buffer" optional from
> > the kfunc-API perspective (of bpf_dynptr_slice()).
> >
> > > It is _not_ input validation.
> > > skb_copy_bits is a slow path. One extra check doesn't affect
> > > performance at all. So 'fast paths' isn't a valid argument here.
> > > The code is reusing
> > >     if (likely(hlen - offset >= len))
> > >         return (void *)data + offset;
> > > which _is_ the fast path.
> > >
> > > What you're requesting is to copy paste
> > > the whole __skb_header_pointer into __skb_header_pointer2.
> > > Makes no sense.
> >
> > No, Alexei, the whole point of skb_header_pointer() is to pass
> > the secondary buffer, to make header parsing dependable.
>
> of course. No one argues about that.
>
> > Passing NULL buffer to skb_header_pointer() is absolutely nonsensical.
>
> Quick grep through the code proves you wrong:
> drivers/net/ethernet/broadcom/bnxt/bnxt.c
> __skb_header_pointer(NULL, start, sizeof(*hp), skb->data,
> skb_headlen(skb), NULL);
>
> was done before this patch. It's using __ variant on purpose
> and explicitly passing skb==NULL to exactly trigger that line
> to deliberately avoid the slow path.
>
> Another example:
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> skb_header_pointer(skb, 0, 0, NULL);
>
> This one I'm not sure about. Looks buggy.

These are both on the Tx path, setting up offloads. Linux doesn't request
offloads for headers outside of the linear part. The ixgbevf code is
completely pointless, as you say.

In general drivers are rarely a source of high quality code examples.
Having been directly involved in the bugs that led to the bnxt code
being written - I was so happy that the driver started parsing Tx
packets *at all*, so I wasn't too fussed by the minor problems :(

> > It should *not* be supported. We had enough prod problems with people
> > thinking that the entire header will be in the linear portion.
> > Then either the NIC can't parse the header, someone enables jumbo,
> > disables GRO, adds new HW, adds encap, etc etc and things implode.
>
> I don't see how this is related.
> NULL buffer allows to get a linear pointer and explicitly avoids
> slow path when it's not linear.

Direct packet access via skb->data is there for those who want high
speed 🤷️

> > If you want to support it in BPF that's up to you, but I think it's
> > entirely reasonable for me to request that you don't do such things
> > in general networking code. The function is 5 LoC, so a local BPF
> > copy seems fine. Although I'd suggest skb_header_pointer_misguided()
> > rather than __skb_header_pointer2() as the name :)
>
> If you insist we can, but bnxt is an example that buffer==NULL is
> a useful concept for networking and not bpf specific.
> It also doesn't make "people think the header is linear" any worse.

My worry is that people will think that whether the buffer is needed or
not depends on _their program_, rather than on the underlying platform.
So if it works in testing without the buffer, they will conclude that
the buffer is not required for their use case.
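
To make that concrete, here is a rough user-space sketch of the semantics
under discussion. It is not the kernel code: toy_pkt, toy_copy_bits() and
toy_header_pointer() are made-up names, and the single "frag" area is a
simplifying assumption. The only point is that the same call with a NULL
buffer succeeds or fails depending on how much of the packet happens to be
linear, which the caller does not control.

```c
#include <stddef.h>

/* Toy packet: the first linear_len bytes live in 'linear', the rest in
 * 'frag'. Hypothetical stand-in for an skb, for illustration only. */
struct toy_pkt {
	const unsigned char *linear;
	size_t linear_len;
	const unsigned char *frag;
	size_t frag_len;
};

/* Slow path: gather len bytes starting at offset across both areas.
 * Returns 0 on success, -1 if the range is outside the packet. */
static int toy_copy_bits(const struct toy_pkt *p, size_t offset,
			 void *to, size_t len)
{
	size_t total = p->linear_len + p->frag_len;
	unsigned char *dst = to;
	size_t i;

	if (offset > total || len > total - offset)
		return -1;

	for (i = 0; i < len; i++) {
		size_t pos = offset + i;

		dst[i] = pos < p->linear_len ? p->linear[pos]
					     : p->frag[pos - p->linear_len];
	}
	return 0;
}

/* Returns a pointer to len bytes at offset, or NULL.
 * With buffer == NULL only the linear fast path can succeed. */
static const void *toy_header_pointer(const struct toy_pkt *p, size_t offset,
				      size_t len, void *buffer)
{
	/* Fast path: the requested range is entirely in the linear part. */
	if (offset <= p->linear_len && p->linear_len - offset >= len)
		return p->linear + offset;

	/* No bounce buffer supplied: the caller gets nothing back. */
	if (!buffer)
		return NULL;

	if (toy_copy_bits(p, offset, buffer, len) < 0)
		return NULL;

	return buffer;
}
```

With a 4-byte linear part, a 4-byte read at offset 0 works with a NULL
buffer, while the same read at offset 2 returns NULL unless a buffer is
passed. A program tested only against the first layout would never see
the failure.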