RE: [PATCH v2 2/2] gro: optimise redundant parsing of packets
From: Willem de Bruijn
Date: Wed Feb 22 2023 - 10:28:04 EST
Richard Gobert wrote:
> Currently the IPv6 extension headers are parsed twice: first in
> ipv6_gro_receive, and then again in ipv6_gro_complete.
>
> By using the new ->transport_proto field, and also storing the size of the
> network header, we can avoid parsing extension headers a second time in
> ipv6_gro_complete (which saves multiple memory dereferences and conditional
> checks inside ipv6_exthdrs_len for a varying number of extension headers in IPv6
> packets).
>
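For readers following the thread, a minimal user-space sketch of the
parse-once idea described above. This is not the kernel code: struct
gro_cb_model, model_receive() and model_complete_transport_offset() are
hypothetical names, and only hop-by-hop and destination-options headers
are treated as extension headers to keep the model short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields the patch adds to napi_gro_cb. */
struct gro_cb_model {
	uint16_t network_len;     /* fixed IPv6 header plus extension headers, in bytes */
	uint8_t  transport_proto; /* next-header value after the last extension header */
};

enum {
	MODEL_IPV6_FIXED_LEN = 40, /* size of the fixed IPv6 header */
	MODEL_NEXTHDR_HOP    = 0,  /* hop-by-hop options */
	MODEL_NEXTHDR_DEST   = 60, /* destination options */
};

/* Simplification: only two extension header types are recognised here. */
static int model_is_ext_hdr(uint8_t proto)
{
	return proto == MODEL_NEXTHDR_HOP || proto == MODEL_NEXTHDR_DEST;
}

/*
 * "Receive" side: walk the extension headers once and cache the result.
 * Returns 0 on success, -1 if the packet is truncated.
 */
static int model_receive(const uint8_t *pkt, size_t len, struct gro_cb_model *cb)
{
	size_t off = MODEL_IPV6_FIXED_LEN;
	uint8_t proto;

	if (len < MODEL_IPV6_FIXED_LEN)
		return -1;

	proto = pkt[6]; /* next-header field of the fixed IPv6 header */
	while (model_is_ext_hdr(proto)) {
		size_t hdr_len;

		if (off + 2 > len)
			return -1;
		/* Length byte counts 8-octet units, not including the first 8 octets. */
		hdr_len = ((size_t)pkt[off + 1] + 1) * 8;
		proto = pkt[off];
		off += hdr_len;
	}
	if (off > len)
		return -1;

	cb->network_len = (uint16_t)off;
	cb->transport_proto = proto;
	return 0;
}

/* "Complete" side: reuse the cached length instead of walking the headers again. */
static size_t model_complete_transport_offset(const struct gro_cb_model *cb, size_t nhoff)
{
	return nhoff + cb->network_len;
}
```

The point of the sketch is only that the complete step becomes a single
addition on cached values, where it previously repeated the loop.
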
> The implementation had to handle both inner and outer layers in case of
> encapsulation (as they can't use the same field).
>
> Performance tests for TCP stream over IPv6 with a varying number of extension
> headers demonstrate throughput improvement of ~0.7%.
>
> In addition, I fixed a potential existing problem:
> - The call to skb_set_inner_network_header at the beginning of
> ipv6_gro_complete calculates inner_network_header based on skb->data,
> setting it to point to the beginning of the ip header.
> - If a packet is going to be handled by BIG TCP, the following code block is
> going to shift the packet header, and skb->data is going to be changed as
> well.
>
> When the two flows are combined, inner_network_header will point to the wrong
> place.
>
> The fix is to place the whole encapsulation branch after the BIG TCP code block.

This should be a separate fix patch?

> This way, inner_network_header is calculated with a correct value of skb->data.
> Also, by arranging the code that way, the optimisation does not add an additional
> branch.
>
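To make the ordering problem described above concrete, a simplified
user-space model. struct skb_model and the helper names are hypothetical,
offsets are plain integers measured from the buffer head, and the only
assumption taken from the commit message is that the BIG TCP block moves
skb->data after inner_network_header has been derived from it.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced model of the sk_buff fields involved. */
struct skb_model {
	size_t data;                 /* offset of the data pointer from the buffer head */
	size_t inner_network_header; /* offset of the inner IP header from the buffer head */
};

/* Models skb_set_inner_network_header(): the offset is derived from the current data. */
static void model_set_inner_network_header(struct skb_model *skb, size_t nhoff)
{
	skb->inner_network_header = skb->data + nhoff;
}

/* Models the BIG TCP block shifting the headers: data moves towards the head. */
static void model_shift_headers(struct skb_model *skb, size_t shift)
{
	skb->data -= shift;
}

/* Order before the patch: set the offset first, then shift. The offset goes stale. */
static size_t model_order_before(struct skb_model skb, size_t nhoff, size_t shift)
{
	model_set_inner_network_header(&skb, nhoff);
	model_shift_headers(&skb, shift);
	return skb.inner_network_header;
}

/* Order after the patch: shift first, then set the offset from the final data. */
static size_t model_order_after(struct skb_model skb, size_t nhoff, size_t shift)
{
	model_shift_headers(&skb, shift);
	model_set_inner_network_header(&skb, nhoff);
	return skb.inner_network_header;
}
```

With data at 64, nhoff of 14 and a shift of 8 bytes, the first ordering
leaves the offset at 78 while the header now sits at 70; the second
ordering yields 70.
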
> Signed-off-by: Richard Gobert <richardbgobert@xxxxxxxxx>
> ---
> include/net/gro.h | 9 +++++++++
> net/ethernet/eth.c | 14 +++++++++++---
> net/ipv6/ip6_offload.c | 20 +++++++++++++++-----
> 3 files changed, 35 insertions(+), 8 deletions(-)
>
> diff --git a/include/net/gro.h b/include/net/gro.h
> index 7b47dd6ce94f..35f60ea99f6c 100644
> --- a/include/net/gro.h
> +++ b/include/net/gro.h
> @@ -86,6 +86,15 @@ struct napi_gro_cb {
>
> /* used to support CHECKSUM_COMPLETE for tunneling protocols */
> __wsum csum;
> +
> + /* Used in ipv6_gro_receive() */
> + u16 network_len;
> +
> + /* Used in eth_gro_receive() */
> + __be16 network_proto;
> +

Why also cache eth->h_proto? That is not mentioned in the commit message.

> + /* Used in ipv6_gro_receive() */
> + u8 transport_proto;