Re: [PATCH] af_packet: Handle outgoing VLAN packets without hardware offloading

From: Chengen Du
Date: Thu May 23 2024 - 10:18:35 EST


Hi Willem, all,

Thank you for highlighting the QinQ and L2.5 issues.
These are areas I am not very familiar with, and I appreciate your guidance.

To address the QinQ and L2.5 issues, the third approach seems like a
promising solution.
If I understand correctly, in the QinQ scenario, we need to preserve
the link layer header because it includes two VLAN tags.
For the L2.5 issue, we can adjust by pulling mac_len instead of
skb_network_offset.
In summary, we may need to retain the link layer header to enable
receivers to parse different protocol scenarios.

Although this approach can resolve all issues, it requires the
receiver's cooperation to implement the parsing logic of the link
layer header.
I am concerned that this implementation may take time and adding it
directly into the kernel could place time pressure on existing users.

I would like to propose some ideas and welcome your thoughts on them.
Firstly, I suggest we address the VLAN issue using the second
approach, as it appears to be a bug and can be resolved without
affecting current users.
Secondly, we could introduce the link layer header preservation via a
new packet socket flag.
Receivers could implement and test this by enabling the socket flag.
This way, we avoid disrupting existing receiver behavior while
providing the capability to parse more complex protocols.

This is a rough idea and may require further discussion.
Please share your opinions if you have any concerns.
Your suggestions and assistance are crucial to resolving this issue,
and I greatly appreciate your input.

Best regards,
Chengen Du

On Thu, May 23, 2024 at 3:54 AM <alexandre.ferrieux@xxxxxxxxxx> wrote:
>
> Hi Willem, all.
>
> On 22/05/2024 20:39, Willem de Bruijn wrote:
> > Chengen Du wrote:
> > >
> > > [1] https://github.com/the-tcpdump-group/libpcap/issues/1105
> > > [2] https://marc.info/?l=linux-netdev&m=165074467517201&w=4
> First, a huge thanks to you guys for digging this out, and for taking it on for
> good.
> > This is all super helpful context and will have to make it into the
> > commit message.
> >
> > So if I understand correctly the issue is inconsistency about whether
> > VLAN tags are L2 or L3, and information getting lost along the way.
> Exactly. As you put it, L2.5 is the hot potato nobody wants to handle; each side
> assumes the other will take care of it :)
> > SOCK_DGRAM mode removes everything up to skb_network_offset, which
> > also removes the VLAN tags. But it does not update skb->protocol.
> >
> > msg_name includes sockaddr_ll.sll_protocol which is set to
> > skb->protocol.
> >
> > So the process gets a packet with purported protocol ETH_P_8021Q
> > starting beginning at an IP or IPv6 header.
> >
> > A few alternatives to address this:
> >
> > 1. insert the VLAN tag back into the packet, with an skb_push.
> > 2. prepare the data as if it is a VLAN offloaded packet:
> > pass the VLAN information through PACKET_AUXDATA.
> > 3. pull not up to skb_network_offset, but pull mac_len.
> >
> > Your patch does the second.
> >
> > I think the approach is largely sound, with a few issues to consider:
> > - QinQ. The current solution just passes the protocol in the outer tag
> > - Other L2.5, like MPLS. This solution does not work for those.
> > (if they need a fix, and the same network_offset issue applies.)
> >
> > 3 would solve all these cases, I think. But is a larger diversion from
> > established behavior.
> I had somehow formed the same analysis and list of available options (with the
> difference that, being a kernel newbie, I was 10% sure).
> A few extra things you might consider before making the decision:
>
> (a) If the absolute highest priority goes to backwards compatibility, then
> Chengen's approach (your 2.) is the clear winner.
>
> (b) However, from my standpoint as a protocol-agnostic packet-cruncher
> (capturing all sorts of encapsulations at midpoint in promiscuous mode - none of
> the packets is really meant for the local stack), the very idea of "special
> casing" *one* level of VLAN and burying it inside an obscure, lossy pseudo-L2
> (SLL), is completely alien. So, any setting (be it not the default) that would
> allow to wipe this wart, would be a huge bonus. So I'd happily go with (1) or
> (3.) which both do that. I'll defer to your appreciation of which is the least
> disruptive.
>
> (c) Speaking of backwards compatibility, I would respectfully like to point out
> that a behavior that has been utterly broken for so many years, might qualify
> for a rather decisive correction. You won't break much anyway :)
>
> TL;DR: all three options are enormously better than the statu quo => by all
> means, fix it either way :)
>
> Best regards,
>
> -Alex
>
>
> ____________________________________________________________________________________________________________
> Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
> pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
> a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
> Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.
>
> This message and its attachments may contain confidential or privileged information that may be protected by law;
> they should not be distributed, used or copied without authorisation.
> If you have received this email in error, please notify the sender and delete this message and its attachments.
> As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
> Thank you.