Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good

From: Willem de Bruijn
Date: Thu Jun 04 2020 - 09:49:22 EST


On Thu, Jun 4, 2020 at 5:47 AM Victor Julien <victor@xxxxxxxxxxxx> wrote:
>
> On 02-06-2020 22:18, Willem de Bruijn wrote:
> > On Tue, Jun 2, 2020 at 4:05 PM Victor Julien <victor@xxxxxxxxxxxx> wrote:
> >>
> >> On 02-06-2020 21:38, Willem de Bruijn wrote:
> >>> On Tue, Jun 2, 2020 at 3:22 PM Victor Julien <victor@xxxxxxxxxxxx> wrote:
> >>>>
> >>>> On 02-06-2020 21:03, Willem de Bruijn wrote:
> >>>>> On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@xxxxxxxxxxxx> wrote:
> >>>>>> On 02-06-2020 19:37, Willem de Bruijn wrote:
> >>>>>>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@xxxxxxxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> On 02-06-2020 16:29, Willem de Bruijn wrote:
> >>>>>>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@xxxxxxxxxxxx> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
> >>>>>>>>>> that the driver has completely validated the checksums in the packet.
> >>>>>>>>>>
> >>>>>>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
> >>>>>>>>>> in that the new flag will only be set if all the layers are valid,
> >>>>>>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
> >>>>>>>>>
> >>>>>>>>> transport, not ip checksum.
> >>>>>>>>
> >>>>>>>> Allow me a n00b question: what does transport refer to here? Things like
> >>>>>>>> ethernet? It isn't clear to me from the doc.
> >>>>>>>
> >>>>>>> The TCP/UDP/.. transport protocol checksum.
> >>>>>>
> >>>>>> Hmm that is what I thought originally, but then it didn't seem to work.
> >>>>>> Hence my patch.
> >>>>>>
> >>>>>> However I just redid my testing. I took the example tpacketv3 program
> >>>>>> and added the status flag checks to the 'display()' func:
> >>>>>>
> >>>>>> if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
> >>>>>> printf("TP_STATUS_CSUM_VALID, ");
> >>>>>> }
> >>>>>> if (ppd->tp_status & (1<<8)) {
> >>>>>> printf("TP_STATUS_CSUM_UNNECESSARY, ");
> >>>>>>
> >>>>>> }
> >>>>>>
> >>>>>> Then using scapy sent some packets in 2 variants:
> >>>>>> - default (good csums)
> >>>>>> - deliberately bad csums
> >>>>>> (then also added a few things like ip6 over ip)
> >>>>>>
> >>>>>>
> >>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
> >>>>>> iface="enp1s0") // good csums
> >>>>>>
> >>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
> >>>>>> iface="enp1s0") //bad tcp
> >>>>>
> >>>>> Is this a test between two machines? What is the device driver of the
> >>>>> machine receiving and printing the packet? It would be helpful to know
> >>>>> whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.
> >>>>
> >>>> Yes 2 machines, or actually 2 machines and a VM. The receiving Linux
> >>>> sits in a kvm vm with network pass through and uses the virtio driver
> >>>> (host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY'
> >>>> virtio seems to support that.
> >>>>
> >>>> I've done some more tests. In a pcap replay that I know contains packet
> >>>> with bad TCP csums (but good IP csums for those pkts), to a physical
> >>>> host running Ubuntu Linux kernel 5.3:
> >>>>
> >>>> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
> >>>> every packet, including the bad TCP ones
> >>>> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
> >>>> packets.
> >>>
> >>> Great. Thanks a lot for running all these experiments.
> >>>
> >>> We might have to drop the TP_STATUS_CSUM_VALID with CHECKSUM_COMPLETE
> >>> unless skb->csum_valid.
> >>>
> >>> For packets with multiple transport layer checksums,
> >>> CHECKSUM_UNNECESSARY should mean that all have been verified.
> >>>
> >>> I believe that in the case of multiple transport headers, csum_valid
> >>> similarly ensures all checksums up to csum_start are valid. Will need
> >>> to double check.
> >>>
> >>> If so, there probably is no need for a separate new TP_STATUS.
> >>> TP_STATUS_CSUM_VALID is reported only when all checksums are valid.
> >>
> >> So if I understand you correctly the key may be in the call to
> >> `skb_csum_unnecessary`:
> >>
> >> That reads:
> >>
> >> static inline int skb_csum_unnecessary(const struct sk_buff *skb)
> >> {
> >> return ((skb->ip_summed == CHECKSUM_UNNECESSARY) ||
> >> skb->csum_valid ||
> >> (skb->ip_summed == CHECKSUM_PARTIAL &&
> >> skb_checksum_start_offset(skb) >= 0));
> >> }
> >>
> >> But really only the first 2 conditions are reachable
> >
> > .. from this codepath. That function is called in other codepaths as well.
> >
> >> , as we already know
> >> skb->ip_summed is not CHECKSUM_PARTIAL when we call it.
> >>
> >> So our unmodified check is:
> >>
> >> else if (skb->pkt_type != PACKET_OUTGOING &&
> >> (skb->ip_summed == CHECKSUM_COMPLETE ||
> >> skb->ip_summed == CHECKSUM_UNNECESSARY ||
> >> skb->csum_valid))
> >>
> >> Should this become something like:
> >>
> >> else if (skb->pkt_type != PACKET_OUTGOING &&
> >> (skb->ip_summed == CHECKSUM_COMPLETE &&
> >> skb->csum_valid) ||
> >> skb->ip_summed == CHECKSUM_UNNECESSARY)
> >>
> >> Is this what you had in mind?
> >
> > I don't suggest modifying skb_csum_unnecessary probably. Certainly not
> > until I've looked at all other callers of it.
> >
> > But in case of packet sockets, yes, adding that csum_valid check is my
> > first rough approximation.
> >
> > That said, first let's give others more familiar with
> > TP_STATUS_CSUM_VALID some time to comment.
> >
>
> I did some more experiments, on real hw this time. I made the following
> change to 5.7.0 (wasn't brave enough to remote upgrade a box to netnext):
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 29bd405adbbd..3afb1913837a 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2216,8 +2216,8 @@ static int tpacket_rcv(struct sk_buff *skb, struct
> net_device *dev,
> if (skb->ip_summed == CHECKSUM_PARTIAL)
> status |= TP_STATUS_CSUMNOTREADY;
> else if (skb->pkt_type != PACKET_OUTGOING &&
> - (skb->ip_summed == CHECKSUM_COMPLETE ||
> - skb_csum_unnecessary(skb)))
> + ((skb->ip_summed == CHECKSUM_COMPLETE &&
> skb->csum_valid) ||
> + skb->ip_summed == CHECKSUM_UNNECESSARY))
> status |= TP_STATUS_CSUM_VALID;
>
> if (snaplen > res)
>
> With this change it seems the TP_STATUS_CSUM_VALID flag is *never* set
> for the nfp driver.

I was mistaken. skb->csum_valid only signals whether the skb->csum
field is initialized. As of commit 573e8fca255a ("net: skb_gro_checksum_*
functions") skb->csum_valid it is always set if CHECKSUM_COMPLETE.
This does not imply that the checksum field in the header is correct.

The checksum field may get checked against the known checksum of
the payload in skb->csum before __netif_receive_skb_core and thus
before packet sockets during GRO when that is enabled. But not
always. Not if the packet gets flushed, for instance, see tcp4_gro_receive.

Commit 662880f44203 ("net: Allow GRO to use and set levels of checksum
unnecessary") indicates that the original assumption in this patch
that CHECKSUM_UNNECESSARY implies all checksums being valid does not
necessarily hold. Drivers are expected to set up skb->csum_level when
they have verified more than just the inner transport header.