From: Jesper Dangaard Brouer <jbrouer@xxxxxxxxxx>
Date: Mon, 22 May 2023 13:41:43 +0200
On 19/05/2023 18.35, Alexander Lobakin wrote:
From: Jesper Dangaard Brouer <jbrouer@xxxxxxxxxx>
Date: Tue, 16 May 2023 17:35:27 +0200
[...]
Not talking about your changes (in this patch).
I'm realizing that SKBs using metadata area will have a performance hit
due to accessing another cacheline (the meta_len in skb_shared_info).
IIRC Daniel complained about this performance hit (in the past), I guess
this explains it. IIRC Cilium changed to use percpu variables/datastore
to workaround this.
Why should we compare metadata of skbs on GRO anyway? I was disabling it
the old hints series (conditionally, if driver asks), moreover...
...if metadata contains full checksum, GRO will be broken completely due
to this comparison (or any other frame-unique fields. VLAN tags and
hashes are okay).
The whole xdp_metalen_invalid() gets expanded into:
return (metalen % 4) || metalen > 255;
at compile-time. All those typeof shenanigans are only to not open-code
meta_len's type/size/max.
But only use for SKBs that gets created from xdp with metadata, right?
Normal netstack processing actually access this skb_shinfo->meta_len in
gro_list_prepare(). As the caller dev_gro_receive() later access other
memory in skb_shared_info, then the GRO code path already takes this hit
to begin with.
You access skb_shinfo() often even before running XDP program, for
example, when a frame is multi-buffer. Plus HW timestamps are also
there, and so on.