Re: net: fix routing encapsulated packets when binding a socket to a tunnel interface

From: lifonghsu
Date: Thu Mar 28 2019 - 04:30:42 EST


David Miller æ 2019-03-28 04:44 åå:
From: lifonghsu <lifonghsu@xxxxxxxxxxxx>
Date: Fri, 22 Mar 2019 14:29:58 +0800

From: LiFong Hsu <lifonghsu@xxxxxxxxxxxx>

When binding a socket to a 4in6/6in4 tunnel interface, the kernel sends
the packet to the tunnel interface without any problem, e.g.,
ping 8.8.8.8 -I 4in6. However, after the 4in6/6in4 tunnel encapsulation,
the encapsulated packet could be sent to the tunnel interface again when
some fields of the skb were changed in mangle table's output chain,
such as skb->mark and src/dest IP address. Sending to the tunnel interface
twice is unexpected, since there are no corresponding routing rules on
the tunnel interface for the encapsulated packet. Eventually, the encapsulated
packet will be dropped by the tunnel interface.

This commit stops referring to sk_bound_dev_if while re-routing a packet
with skb_iif!=0 which indicates that the packet has already been sent to
the interface specified by sk_bound_dev_if. Instead, this commit sends
the packet to the underlying network device.

Signed-off-by: LiFong Hsu <lifonghsu@xxxxxxxxxxxx>
Reviewed-by: JianJhen Chen <kchen@xxxxxxxxxxxx>

If you do not respond to my feedback, I am going to drop your patch.

You have 24 hours to respond before that happens.

Sorry for the late reply.

skb->skb_iif is a receive side indication, why would it be changed on transmit?
Indeed, skb_iif is used as receive site indication to present "device the packet arrived on".
This commit keeps the previous arrived device (similar to the concept of "device the packet arrived on") in skb_iif field to prevent kernel from referring sk_bound_dev_if again. Otherwise, we might need to add a new field to sk_buff structure for our purpose.

An example of how this commit works with 4in6 tunnel device is listed as follows:
socket bind 4in6 device (e.g. ping -I 4in6 8.8.8.8)
-> skb is transmitted to 4in6 device by referring sk_bound_dev_if
-> encapsulate the IPv4 packet to an IPv6 packet
-> "this commit sets skb_iif to 4in6 device as a flag"
-> ip6tunnel_xmit
-> re-route the encapsulated IPv6 packet
-> "this commit prevents kernel from referring sk_bound_dev_if again by checking skb_iif, so the encapsulated IPv6 packet will be sent to the underlying device instead of the 4in6 device"
-> underlying device

I see mac802154 doing this, but what it's doing is somewhat broken and that doesn't come into play in your example.

We have not tested mac802154, but we think this commit should be compatible with mac802154. Considering mac802154 is an underlying device in the above mentioned example. The skb_iif will be overwritten by mac802154 at the end. So the behavior of mac802154 should not be affected by this commit.

Thanks.