Re: [PATCH bpf-next v3 1/3] xsk: Support UMEM chunk_size > PAGE_SIZE
From: Magnus Karlsson
Date: Fri Apr 21 2023 - 08:18:04 EST
On Fri, 21 Apr 2023 at 11:44, Maciej Fijalkowski
<maciej.fijalkowski@xxxxxxxxx> wrote:
>
> On Tue, Apr 18, 2023 at 01:12:00PM +0200, Kal Cutter Conley wrote:
>
> Hi there,
>
> > > >> In addition, presumably when using this mode, the other XDP actions
> > > >> (XDP_PASS, XDP_REDIRECT to other targets) would stop working unless we
> > > >> add special handling for that in the kernel? We'll definitely need to
> > > >> handle that somehow...
> > > >
> > > > I am not familiar with all the details here. Do you know a reason why
> > > > these cases would stop working / why special handling would be needed?
> > > > For example, if I have a UMEM that uses hugepages and XDP_PASS is
> > > > returned, then the data is just copied into an SKB right? SKBs can
> > > > also be created directly from hugepages AFAIK. So I don't understand
> > > > what the issue would be. Can someone explain this concern?
> > >
> > > Well, I was asking :) It may well be that the SKB path just works; did
> > > you test this? Pretty sure XDP_REDIRECT to another device won't, though?
>
> for XDP_PASS we have to allocate a new buffer and copy the contents from
> current xdp_buff that was backed by xsk_buff_pool and give the current one
> back to pool. I am not sure if __napi_alloc_skb() is always capable of
> handling len > PAGE_SIZE - i believe there might a particular combination
> of settings that allows it, but if not we should have a fallback path that
> would iterate over data and copy this to a certain (linear + frags) parts.
> This implies non-zero effort that is needed for jumbo frames ZC support.
Thinking aloud, could not our multi-buffer work help with this? Sounds
quite similar to operations that we have to do in that patch set. And
if so, would it not be prudent to get the multi-buffer support in
there first, then implement these things on top of that? What do you
think?
> I can certainly test this out and play with it - maybe this just works, I
> didn't check yet. Even if it does, then we need some kind of temporary
> mechanism that will forbid loading ZC jumbo frames due to what Toke
> brought up.
>
> > >
> >
> > I was also asking :-)
> >
> > I tested that the SKB path is usable today with this patch.
> > Specifically, sending and receiving large jumbo packets with AF_XDP
> > and that a non-multi-buffer XDP program could access the whole packet.
> > I have not specifically tested XDP_REDIRECT to another device or
> > anything with ZC since that is not possible without driver support.
> >
> > My feeling is, there wouldn't be non-trivial issues here since this
> > patchset changes nothing except allowing the maximum chunk size to be
> > larger. The driver either supports larger MTUs with XDP enabled or it
> > doesn't. If it doesn't, the frames are dropped anyway. Also, chunk
> > size mismatches between two XSKs (e.g. with XDP_REDIRECT) would be
> > something supported or not supported irrespective of this patchset.
>
> Here is the comparison between multi-buffer and jumbo frames that I did
> for ZC ice driver. Configured MTU was 8192 as this is the frame size for
> aligned mode when working with huge pages. I am presenting plain numbers
> over here from xdpsock.
>
> Mbuf, packet size = 8192 - XDP_PACKET_HEADROOM
> 885,705pps - rxdrop frame_size=4096
> 806,307pps - l2fwd frame_size=4096
> 877,989pps - rxdrop frame_size=2048
> 773,331pps - l2fwd frame_size=2048
>
> Jumbo, packet size = 8192 - XDP_PACKET_HEADROOM
> 893,530pps - rxdrop frame_size=8192
> 841,860pps - l2fwd frame_size=8192
>
> Kal might say that multi-buffer numbers are imaginary as these patches
> were never shown to the public ;) but now that we have extensive test
> suite I am fixing some last issues that stand out, so we are asking for
> some more patience over here... overall i was expecting that they will be
> much worse when compared to jumbo frames, but then again i believe this
> implementation is not ideal and can be improved. Nevertheless, jumbo
> frames support has its value.