Re: [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap
From: Jiayuan Chen
Date: Thu Apr 23 2026 - 22:05:09 EST
On 4/23/26 4:19 PM, Paolo Abeni wrote:
On 4/19/26 3:01 PM, Jiayuan Chen wrote:
[...]
+662,18 @@ static inline int iptunnel_pull_offloads(struct sk_buff *skb)
 	return 0;
 }
+static inline void iptunnel_rebuild_transport_header(struct sk_buff *skb)
+{
+	if (!skb_is_gso(skb))
+		return;
+
+	skb->transport_header = (typeof(skb->transport_header))~0U;
+	skb_probe_transport_header(skb);
+
+	if (!skb_transport_header_was_set(skb))
+		skb_gso_reset(skb);

I do not think this makes sense.
What is a valid case for this packet being processed further?
The buggy packet must be dropped, instead of being mangled like this.

Hi Eric,
The reproducer builds a gre frame whose inner Ethernet header is
all-zero. Tracing the skb through RX:
1. At GRE decap exit, skb_transport_offset(skb) < 0 is the rule, not the
exception.
It is negative for every packet leaving the tunnel, including perfectly
well-formed inner IPv4 traffic
because the tunnel leaves skb->transport_header at the outer L4 offset while
pskb_pull() has already advanced skb->data past it.

Is it? the transport header is an offset on top of skb->head, pskb_pull
changes head only if the header is not in the linear part (and the
transport offset is already invalid).

Sorry, my wording was imprecise. The point is not that `transport_header`
itself holds a negative value — it does not — but that after GRE processing,
`skb->data` has advanced past the outer L4 while `skb->transport_header`
is never touched, so `skb_transport_offset(skb)` ends up negative.
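
To make the arithmetic concrete, here is a toy userspace model of the three fields involved. It is only a sketch: `struct toy_skb` and the `toy_*` helpers are made up for illustration and are not the kernel's definitions, and the header sizes (20-byte outer IPv4, 4-byte GRE base header, 14-byte inner Ethernet) are assumed for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the sk_buff fields discussed here; NOT the kernel struct. */
struct toy_skb {
	unsigned char *head;       /* start of the buffer */
	unsigned char *data;       /* start of the not-yet-consumed bytes */
	uint16_t transport_header; /* offset from head, as in the kernel */
};

/* Same idea as skb_transport_offset(): transport header position minus data. */
static ptrdiff_t toy_transport_offset(const struct toy_skb *skb)
{
	return (skb->head + skb->transport_header) - skb->data;
}

/* Models the effect of a pull on the offsets: only data advances. */
static void toy_pull(struct toy_skb *skb, size_t len)
{
	skb->data += len;
}

/* Hypothetical decap walk with the assumed header sizes. */
static ptrdiff_t toy_offset_after_decap(void)
{
	static unsigned char buf[128];
	struct toy_skb skb = { .head = buf, .data = buf, .transport_header = 20 };

	assert(toy_transport_offset(&skb) == 20); /* points at the outer L4 (GRE) */

	toy_pull(&skb, 20); /* outer IPv4 header */
	toy_pull(&skb, 4);  /* GRE base header */
	toy_pull(&skb, 14); /* inner Ethernet header */

	/* transport_header was never touched, so the offset is now stale. */
	return toy_transport_offset(&skb);
}
```

With these assumed sizes the stored `transport_header` stays at 20 while `data` ends up 38 bytes into the buffer, so the computed offset is 20 - 38 = -18.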
skb_transport_header_was_set() stays true, so downstream
code that trusts that flag now trusts a stale, negative offset.
2. GRO repairs it, but only for protocols it knows.
In dev_gro_receive(), skb->protocol is dispatched through the offload
table. For ETH_P_IP, inet_gro_receive() calls
skb_set_transport_header(skb, skb_gro_offset(skb)), and the offset
becomes valid again. But for a malformed skb, dev_gro_receive() just
bypasses it.

So only malformed packets cause trouble, right?

The negative offset is produced for every packet leaving GRE, not just
malformed ones. What differs is what happens downstream:
- For well-formed inner IPv4, `inet_gro_receive()` calls
`skb_set_transport_header(skb, skb_gro_offset(skb))` and restores a
valid offset before any consumer observes it.
- For malformed inner frames (e.g. `skb->protocol == htons(ETH_P_802_2)`,
among others), `dev_gro_receive()` finds no ptype and just passes the
skb through.
The stale negative offset survives into `__netif_receive_skb_core()`.
So the UAF needs both conditions: GRE producing the stale offset *and*
no downstream rescue.
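
The two outcomes can be sketched with a toy dispatch function. This is a simplified userspace illustration, not the real GRO code: `struct gro_toy_skb`, `gro_toy_receive()` and the constants are hypothetical names, the protocol is kept in host byte order for brevity, and the stale value of -18 is just an assumed example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GRO_TOY_ETH_P_IP    0x0800
#define GRO_TOY_ETH_P_802_2 0x0004

/* Toy skb: the transport offset is stored relative to data for simplicity. */
struct gro_toy_skb {
	uint16_t protocol;    /* host byte order in this sketch */
	int transport_offset; /* what a later consumer would observe */
};

/*
 * Models the dispatch: a protocol with a receive callback gets its
 * transport offset rewritten, anything else passes through untouched.
 */
static bool gro_toy_receive(struct gro_toy_skb *skb, int gro_offset)
{
	switch (skb->protocol) {
	case GRO_TOY_ETH_P_IP:
		/* Stands in for skb_set_transport_header(skb, skb_gro_offset(skb)). */
		skb->transport_offset = gro_offset;
		return true;
	default:
		/* No callback found: the stale offset survives. */
		return false;
	}
}

/* Offset seen after the GRO stage, starting from an assumed stale -18. */
static int gro_toy_offset_seen_by_core(uint16_t protocol)
{
	struct gro_toy_skb skb = { .protocol = protocol, .transport_offset = -18 };

	gro_toy_receive(&skb, 20); /* 20 = assumed inner IPv4 header length */
	return skb.transport_offset;
}
```

In this model the IPv4 case comes out with a valid offset of 20, while the 802.2 case still carries -18 into the next stage.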

3. Both kinds then reach __netif_receive_skb_core().
So the skb that qdisc/tc/BPF segmenters later see has an
invariant violation (_was_set == true but offset < 0) that the core
layer has no intention of catching for us.
My reading of this is that the tunnel decap path is producing an skb
that doesn't honor the contract __netif_receive_skb_core() expects from
its producers, and that it doesn't really make sense to ask GRE to parse
or validate the inner L4 in order to fix this.
I'm thinking at the end of GRE decap, before handing the skb to
gro_cells_receive(), call skb_reset_transport_header(skb).

My take is that you need to address the issue earlier than the current
patch, dropping the malformed packets.
/P

Dropping at tunnel decap is a reasonable option, e.g.:
	if (unlikely(skb->protocol == htons(ETH_P_802_2) ||
		     skb->protocol == htons(ETH_P_802_3) ||
		     ....)) {
		kfree_skb_reason(skb, SKB_DROP_REASON_...);
		return 0;
	}
Two concerns about this approach, though:
1. It asks GRE to decide whether an inner L2 frame is "sensible",
   which I don't think should be GRE's responsibility: GRE is a
   generic L2/L3 tunnel and historically stays agnostic about the
   inner payload.
2. More importantly, filtering on ETH_P_802_2 / ETH_P_802_3 only
   covers the case where inner h_proto < ETH_P_802_3_MIN. The same
   stale-offset condition can also be reached with any inner
   ethertype that has no GRO receive callback resetting
   transport_header.
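
A small userspace sketch of concern 2. It models only the classification step as I understand it from eth_type_trans() (values at or above ETH_P_802_3_MIN are taken as the ethertype, smaller ones become ETH_P_802_3 or ETH_P_802_2), and everything prefixed `ft_` is a hypothetical name for illustration. Protocols are kept in host byte order here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same numeric values as the kernel's if_ether.h definitions. */
#define FT_ETH_P_802_3     0x0001
#define FT_ETH_P_802_2     0x0004
#define FT_ETH_P_802_3_MIN 0x0600
#define FT_ETH_P_ARP       0x0806

/*
 * Classification only: h_proto is the inner Ethernet type/length field,
 * b0/b1 are the first two payload bytes after the Ethernet header.
 */
static uint16_t ft_classify(uint16_t h_proto, uint8_t b0, uint8_t b1)
{
	if (h_proto >= FT_ETH_P_802_3_MIN)
		return h_proto;         /* a real ethertype */
	if (b0 == 0xFF && b1 == 0xFF)
		return FT_ETH_P_802_3;  /* raw 802.3 */
	return FT_ETH_P_802_2;          /* treated as 802.2 LLC */
}

/* The drop filter from the snippet above, limited to the two listed types. */
static bool ft_filter_drops(uint16_t protocol)
{
	return protocol == FT_ETH_P_802_2 || protocol == FT_ETH_P_802_3;
}
```

An all-zero inner Ethernet header classifies as ETH_P_802_2 and is caught by the filter, but an inner ARP frame keeps its ethertype and is not, even though (per the point above) nothing in GRO would reset its transport_header either.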
In my earlier reply to Eric I suggested calling
skb_reset_transport_header(skb) at the tunnel decap exit instead.
A few reasons I think this is a cleaner fix:
1. It is inner-protocol agnostic: it normalizes the skb regardless
   of what the inner ethertype happens to be, so ARP/PPPoE/... are
   fixed by the same one-liner.
2. ip_tunnel_rcv() already updates mac_header (via eth_type_trans)
   and network_header (ip_tunnel.c:414). transport_header is the
   only one of the three left pointing at the outer offset; resetting
   it here is completing what the function is already doing for the
   other two.
3. Malformed frames that continue downstream with the reset offset
   simply fail ptype_base dispatch and are dropped there, the same way
   any unknown-ethertype frame is dropped today.
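
For reference, the effect of the proposed reset can be modelled in a few lines of userspace C. This is a sketch under the same caveats as before: `struct rst_toy_skb` and the `rst_toy_*` helpers are hypothetical, and the pull lengths are only example values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the relevant sk_buff fields; NOT the kernel struct. */
struct rst_toy_skb {
	unsigned char *head;
	unsigned char *data;
	uint16_t transport_header; /* offset from head */
};

static ptrdiff_t rst_toy_transport_offset(const struct rst_toy_skb *skb)
{
	return (skb->head + skb->transport_header) - skb->data;
}

/* Same idea as skb_reset_transport_header(): point it at the current data. */
static void rst_toy_reset_transport_header(struct rst_toy_skb *skb)
{
	skb->transport_header = (uint16_t)(skb->data - skb->head);
}

/* Pull an arbitrary number of bytes, then reset, and report the offset. */
static ptrdiff_t rst_toy_offset_after_reset(size_t pulled)
{
	static unsigned char buf[256];
	struct rst_toy_skb skb = { .head = buf, .data = buf, .transport_header = 20 };

	assert(pulled < sizeof(buf));
	skb.data += pulled;                   /* decap consumed the outer headers */
	rst_toy_reset_transport_header(&skb); /* the proposed one-liner */
	return rst_toy_transport_offset(&skb);
}
```

Whatever was pulled, the offset comes out as 0 after the reset, which is why the fix does not need to know anything about the inner protocol.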