Re: [PATCH] af_packet: Handle outgoing VLAN packets without hardware offloading

From: alexandre . ferrieux
Date: Wed May 22 2024 - 15:55:01 EST


Hi Willem, all.

On 22/05/2024 20:39, Willem de Bruijn wrote:
Chengen Du wrote:
>
> [1] https://github.com/the-tcpdump-group/libpcap/issues/1105
> [2] https://marc.info/?l=linux-netdev&m=165074467517201&w=4
First, a huge thanks to you guys for digging this out, and for taking it on for good.
This is all super helpful context and will have to make it into the
commit message.

So if I understand correctly the issue is inconsistency about whether
VLAN tags are L2 or L3, and information getting lost along the way.
Exactly. As you put it, L2.5 is the hot potato nobody wants to handle; each side assumes the other will take care of it :)
SOCK_DGRAM mode removes everything up to skb_network_offset, which
also removes the VLAN tags. But it does not update skb->protocol.

msg_name includes sockaddr_ll.sll_protocol which is set to
skb->protocol.

So the process gets a packet with purported protocol ETH_P_8021Q
starting beginning at an IP or IPv6 header.

A few alternatives to address this:

1. insert the VLAN tag back into the packet, with an skb_push.
2. prepare the data as if it is a VLAN offloaded packet:
pass the VLAN information through PACKET_AUXDATA.
3. pull not up to skb_network_offset, but pull mac_len.

Your patch does the second.

I think the approach is largely sound, with a few issues to consider:
- QinQ. The current solution just passes the protocol in the outer tag
- Other L2.5, like MPLS. This solution does not work for those.
(if they need a fix, and the same network_offset issue applies.)

3 would solve all these cases, I think. But is a larger diversion from
established behavior.
I had somehow formed the same analysis and list of available options (with the difference that, being a kernel newbie, I was 10% sure).
A few extra things you might consider before making the decision:

(a) If the absolute highest priority goes to backwards compatibility, then Chengen's approach (your 2.) is the clear winner.

(b) However, from my standpoint as a protocol-agnostic packet-cruncher (capturing all sorts of encapsulations at midpoint in promiscuous mode - none of the packets is really meant for the local stack), the very idea of "special casing" *one* level of VLAN and burying it inside an obscure, lossy pseudo-L2 (SLL), is completely alien. So, any setting (be it not the default) that would allow to wipe this wart, would be a huge bonus. So I'd happily go with (1.) or (3.) which both do that. I'll defer to your appreciation of which is the least disruptive.

(c) Speaking of backwards compatibility, I would respectfully like to point out that a behavior that has been utterly broken for so many years, might qualify for a rather decisive correction. You won't break much anyway :)

TL;DR: all three options are enormously better than the statu quo => by all means, fix it either way :)

Best regards,

-Alex


____________________________________________________________________________________________________________
Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.