Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC

From: Ido Schimmel
Date: Sun Dec 08 2024 - 10:47:43 EST


On Fri, Dec 06, 2024 at 12:45:02PM +0000, Wei Fang wrote:
> > -----Original Message-----
> > From: Simon Horman <horms@xxxxxxxxxx>
> > Sent: December 6, 2024 20:31
> > To: Wei Fang <wei.fang@xxxxxxx>
> > Cc: Claudiu Manoil <claudiu.manoil@xxxxxxx>; Vladimir Oltean
> > <vladimir.oltean@xxxxxxx>; Clark Wang <xiaoning.wang@xxxxxxx>;
> > andrew+netdev@xxxxxxx; davem@xxxxxxxxxxxxx; edumazet@xxxxxxxxxx;
> > kuba@xxxxxxxxxx; pabeni@xxxxxxxxxx; Frank Li <frank.li@xxxxxxx>;
> > netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; imx@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > offload for i.MX95 ENETC
> >
> > On Fri, Dec 06, 2024 at 10:33:15AM +0000, Wei Fang wrote:
> > > > -----Original Message-----
> > > > From: Simon Horman <horms@xxxxxxxxxx>
> > > > Sent: December 6, 2024 17:23
> > > > To: Wei Fang <wei.fang@xxxxxxx>
> > > > Cc: Claudiu Manoil <claudiu.manoil@xxxxxxx>; Vladimir Oltean
> > > > <vladimir.oltean@xxxxxxx>; Clark Wang <xiaoning.wang@xxxxxxx>;
> > > > andrew+netdev@xxxxxxx; davem@xxxxxxxxxxxxx; edumazet@xxxxxxxxxx;
> > > > kuba@xxxxxxxxxx; pabeni@xxxxxxxxxx; Frank Li <frank.li@xxxxxxx>;
> > > > netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; imx@xxxxxxxxxxxxxxx
> > > > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > > > offload for i.MX95 ENETC
> > > >
> > > > On Wed, Dec 04, 2024 at 01:29:28PM +0800, Wei Fang wrote:
> > > > > ENETC rev 4.1 supports TCP and UDP checksum offload for receive, the bit
> > > > > 108 of the Rx BD will be set if the TCP/UDP checksum is correct. Since
> > > > > this capability is not defined in register, the rx_csum bit is added to
> > > > > > struct enetc_drvdata to indicate whether the device supports Rx checksum
> > > > > offload.
> > > > >
> > > > > Signed-off-by: Wei Fang <wei.fang@xxxxxxx>
> > > > > Reviewed-by: Frank Li <Frank.Li@xxxxxxx>
> > > > > Reviewed-by: Claudiu Manoil <claudiu.manoil@xxxxxxx>
> > > > > ---
> > > > > v2: no changes
> > > > > v3: no changes
> > > > > v4: no changes
> > > > > v5: no changes
> > > > > v6: no changes
> > > > > ---
> > > > > > drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++----
> > > > > drivers/net/ethernet/freescale/enetc/enetc.h | 2 ++
> > > > > drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++
> > > > > .../net/ethernet/freescale/enetc/enetc_pf_common.c | 3 +++
> > > > > 4 files changed, 17 insertions(+), 4 deletions(-)
> > > > >
> > > > > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > > index 35634c516e26..3137b6ee62d3 100644
> > > > > > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > > @@ -1011,10 +1011,15 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
> > > > > >
> > > > > >  	/* TODO: hashing */
> > > > > >  	if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
> > > > > > -		u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > > > -
> > > > > > -		skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > > > > -		skb->ip_summed = CHECKSUM_COMPLETE;
> > > > > > +		if (priv->active_offloads & ENETC_F_RXCSUM &&
> > > > > > +		    le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_L4_CSUM_OK) {
> > > > > > +			skb->ip_summed = CHECKSUM_UNNECESSARY;
> > > > > > +		} else {
> > > > > > +			u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > > > +
> > > > > > +			skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > > > > +			skb->ip_summed = CHECKSUM_COMPLETE;
> > > > > > +		}
> > > > > >  	}
> > > >
> > > > Hi Wei,
> > > >
> > > > I am wondering about the relationship between the above and
> > > > hardware support for CHECKSUM_COMPLETE.
> > > >
> > > > > Prior to this patch CHECKSUM_COMPLETE was always used, which seems
> > > > > desirable. But with this patch, CHECKSUM_UNNECESSARY is conditionally
> > > > > used.
> > > >
> > > > > If those cases don't work with CHECKSUM_COMPLETE then is this a bug-fix?
> > > >
> > > > Or, alternatively, if those cases do work with CHECKSUM_COMPLETE, then
> > > > I'm unsure why this change is necessary or desirable. It's my understanding
> > > > that from the Kernel's perspective CHECKSUM_COMPLETE is preferable to
> > > > CHECKSUM_UNNECESSARY.
> > > >
> > > > ...
> > >
> > > > Rx checksum offload is a new feature of ENETC v4. We would like to exploit
> > > > this capability of the hardware to save CPU cycles in calculating and
> > > > verifying checksum.
> > >
> >
> > Understood, but CHECKSUM_UNNECESSARY is usually the preferred option as it
> > is more flexible, e.g. allowing low-cost calculation of inner checksums
> > in the presence of encapsulation.
>
> I think you mean 'CHECKSUM_COMPLETE' is the preferred option. But there is no
> strong reason against using CHECKSUM_UNNECESSARY. So I hope to keep this patch.

I was also under the impression that CHECKSUM_COMPLETE is more desirable
than CHECKSUM_UNNECESSARY. Maybe Tom can help.

Tom:

If a device can report both CHECKSUM_UNNECESSARY and CHECKSUM_COMPLETE,
is there any advantage in reporting CHECKSUM_UNNECESSARY? The only
advantage I can think of is that when the kernel pulls headers (IPv6 for
example) it wouldn't need to compute their checksum in order to adjust
skb->csum, but I am not sure how critical that is.
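
To make the header-pull cost concrete, here is a minimal userspace sketch,
not kernel code. It assumes a 16-bit folded ones'-complement sum, whereas
the kernel keeps a 32-bit partial sum in skb->csum and, as far as I
understand, does this adjustment through helpers such as
skb_postpull_rcsum(). The names ones_add, ones_sub, ones_sum and
csum_after_pull are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Add two 16-bit ones'-complement values with end-around carry. */
static uint16_t ones_add(uint16_t a, uint16_t b)
{
	uint32_t sum = (uint32_t)a + b;

	/* At most one carry bit can be set, so a single fold is enough. */
	return (uint16_t)((sum & 0xffff) + (sum >> 16));
}

/* Subtract b from a: in ones'-complement arithmetic, -b is ~b. */
static uint16_t ones_sub(uint16_t a, uint16_t b)
{
	return ones_add(a, (uint16_t)~b);
}

/*
 * Ones'-complement sum over len bytes, read as big-endian 16-bit words.
 * A trailing odd byte is padded with a zero low byte.
 */
static uint16_t ones_sum(const uint8_t *data, size_t len)
{
	uint16_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = ones_add(sum, (uint16_t)((data[i] << 8) | data[i + 1]));
	if (len & 1)
		sum = ones_add(sum, (uint16_t)(data[len - 1] << 8));

	return sum;
}

/*
 * Hypothetical stand-in for the CHECKSUM_COMPLETE adjustment: after
 * pulling hdr_len bytes (hdr_len assumed even), the header's own sum
 * has to be computed and removed from the sum the device reported.
 */
static uint16_t csum_after_pull(uint16_t full_sum, const uint8_t *hdr,
				size_t hdr_len)
{
	return ones_sub(full_sum, ones_sum(hdr, hdr_len));
}
```

With CHECKSUM_COMPLETE, each pulled header costs one pass like ones_sum()
over the header bytes. With CHECKSUM_UNNECESSARY there is no skb->csum to
keep in sync, so that pass is skipped.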

I am asking because I would like to know what the recommendation is for
future devices: Implement both or only CHECKSUM_COMPLETE?

Original patch is here [1] and I did read your paper [2] and David's
presentation [3].

Thanks

[1] https://lore.kernel.org/netdev/20241204052932.112446-1-wei.fang@xxxxxxx/T/#mf89bb4c6c72e8dd4a697551cbc9485217366d013
[2] https://people.netfilter.org/pablo/netdev0.1/papers/UDP-Encapsulation-in-Linux.pdf
[3] https://www.netdevconf.info/1.1/proceedings/slides/miller-hardware-checksumming.pdf