Re: [RFC PATCH 1/1] bpf: Add tunnel decapsulation and GSO state updates per new flags

From: Hudson, Nick

Date: Tue Mar 10 2026 - 12:28:52 EST




> On 25 Feb 2026, at 15:45, Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx> wrote:
>
> Hudson, Nick wrote:
>>
>>
>>> On 20 Feb 2026, at 21:08, Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>>
>>> Nick Hudson wrote:
>>>> Enable BPF programs to properly handle GSO state when decapsulating
>>>> tunneled packets by adding selective GSO flag clearing and a trusted
>>>> mode for GSO handling.
>>>>
>>>> New decapsulation flags:
>>>>
>>>> - BPF_F_ADJ_ROOM_DECAP_L4_UDP: Clear UDP tunnel GSO flags
>>>> (SKB_GSO_UDP_TUNNEL, SKB_GSO_UDP_TUNNEL_CSUM)
>>>> - BPF_F_ADJ_ROOM_DECAP_L4_GRE: Clear GRE tunnel GSO flags
>>>> (SKB_GSO_GRE, SKB_GSO_GRE_CSUM)
>>>> - BPF_F_ADJ_ROOM_DECAP_IPXIP4: Clear SKB_GSO_IPXIP4 flag for
>>>> IPv4-in-IPv4 (IPIP) and IPv6-in-IPv4 (SIT) tunnels
>>>> - BPF_F_ADJ_ROOM_DECAP_IPXIP6: Clear SKB_GSO_IPXIP6 flag for
>>>> IPv6-in-IPv6 and IPv4-in-IPv6 tunnels
>>>> - BPF_F_ADJ_ROOM_NO_DODGY: Preserve gso_segs and don't set
>>>> SKB_GSO_DODGY when the BPF program is trusted and modifications
>>>> are known to be valid
>>>>
>>>> The existing anonymous enum for BPF_FUNC_skb_adjust_room flags is
>>>> renamed to enum bpf_adj_room_flags to enable CO-RE (Compile Once -
>>>> Run Everywhere) lookups in BPF programs.
>>>>
>>>> By default, bpf_skb_adjust_room sets SKB_GSO_DODGY and resets
>>>> gso_segs to 0, forcing revalidation. The NO_DODGY flag bypasses this
>>>> for trusted programs that guarantee GSO correctness.
>>>>
>>>> Usage example (decapsulating UDP tunnel with IPv4 inner packet):
>>>> bpf_skb_adjust_room(skb, -hdr_len, BPF_ADJ_ROOM_NET,
>>>> BPF_F_ADJ_ROOM_DECAP_L3_IPV4 |
>>>> BPF_F_ADJ_ROOM_DECAP_L4_UDP);
>>>
>>> This patch is doing too much in one patch.
>>
>> Sure, I’ll split it up.
>>
>>>
>>> Also not convinced of the need for the NO_DODGY flag.
>>
>> The reason for NO_DODGY is that, without it, the egress interface will see the
>> SKB_GSO_DODGY flag. In our use case, we want to avoid marking the egress tap as
>> NETIF_F_GSO_ROBUST, so the skb will fail skb_gso_ok() with SKB_GSO_DODGY set.
>> When skb_gso_ok() fails, validate_xmit_skb() calls skb_gso_segment().
>
> I understand why you might want it. But the dodgy check has long been
> there for a reason: because these transformations are not blindly
> accepted by the kernel. This use case does not change that.

The defence I came up with here is:

- NETIF_F_GSO_ROBUST is a device-level property, so setting it on the
  tun/tap device affects both the host-to-guest and the guest-to-host
  direction. The former is trusted, the latter is not, so this is not
  an option.
- The host-to-guest direction is fully trusted.
- The physical NIC driver is trusted (kernel driver, hardware-validated GSO).
- The BPF program is trusted (privileged, CAP_BPF, verified by the kernel).
- Decapsulation is a trusted operation for BPF code authors.
- Bridge + TAP is internal kernel forwarding.

If that is still not enough, would protecting the use of NO_DODGY with a
sysctl make it acceptable?
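
To make the skb_gso_ok() failure mode concrete, here is a small userspace
model of the check. It is only a sketch, not kernel code: the MODEL_* bit
values, model_decap() and its sysctl_allows argument are made up for
illustration. Only the shape of model_gso_ok() follows the kernel's
net_gso_ok(), where every gso_type bit must have a matching device feature
bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative gso_type bits; the kernel's real values differ. */
#define MODEL_GSO_TCPV4       (1u << 0)
#define MODEL_GSO_DODGY       (1u << 1)
#define MODEL_GSO_UDP_TUNNEL  (1u << 2)

/* In this model, gso_type bit N maps to device feature bit N + 8. */
#define MODEL_GSO_SHIFT       8
#define MODEL_F_TSO           (MODEL_GSO_TCPV4 << MODEL_GSO_SHIFT)
#define MODEL_F_GSO_ROBUST    (MODEL_GSO_DODGY << MODEL_GSO_SHIFT)

/*
 * Same shape as net_gso_ok(): the device must advertise a feature bit
 * for every bit set in gso_type, otherwise the skb is segmented in
 * software before transmit.
 */
static bool model_gso_ok(uint32_t features, uint32_t gso_type)
{
	uint32_t needed = gso_type << MODEL_GSO_SHIFT;

	return (features & needed) == needed;
}

/*
 * Hypothetical decap step: clear the UDP tunnel bit, then mark the skb
 * dodgy unless the caller asked for NO_DODGY and a (hypothetical)
 * sysctl permits it.
 */
static uint32_t model_decap(uint32_t gso_type, bool no_dodgy,
			    bool sysctl_allows)
{
	gso_type &= ~MODEL_GSO_UDP_TUNNEL;
	if (!(no_dodgy && sysctl_allows))
		gso_type |= MODEL_GSO_DODGY;
	return gso_type;
}
```

With a tap that advertises only MODEL_F_TSO, the default decap result fails
model_gso_ok() because of the dodgy bit, while the NO_DODGY result passes.
Adding MODEL_F_GSO_ROBUST to the tap would also make the default case pass,
but as a device-level feature it would apply to guest-originated traffic
too.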

