Re: [PATCH for v4.9 LTS 86/87] net: account for current skb length when deciding about UFO

From: Levin, Alexander (Sasha Levin)
Date: Mon Jul 24 2017 - 10:11:31 EST


On Sat, Jul 15, 2017 at 10:53:27AM +0200, Michal Kubecek wrote:
>On Sat, Jul 15, 2017 at 01:26:27AM +0000, Levin, Alexander (Sasha Levin) wrote:
>> From: Michal KubeÄek <mkubecek@xxxxxxx>
>>
>> [ Upstream commit a5cb659bbc1c8644efa0c3138a757a1e432a4880 ]
>>
>> Our customer encountered stuck NFS writes for blocks starting at specific
>> offsets w.r.t. page boundary caused by networking stack sending packets via
>> UFO enabled device with wrong checksum. The problem can be reproduced by
>> composing a long UDP datagram from multiple parts using MSG_MORE flag:
>>
>> sendto(sd, buff, 1000, MSG_MORE, ...);
>> sendto(sd, buff, 1000, MSG_MORE, ...);
>> sendto(sd, buff, 3000, 0, ...);
>>
>> Assume this packet is to be routed via a device with MTU 1500 and
>> NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
>> this condition is tested (among others) to decide whether to call
>> ip_ufo_append_data():
>>
>> ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))
>>
>> At the moment, we already have skb with 1028 bytes of data which is not
>> marked for GSO so that the test is false (fragheaderlen is usually 20).
>> Thus we append second 1000 bytes to this skb without invoking UFO. Third
>> sendto(), however, has sufficient length to trigger the UFO path so that we
>> end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
>> uses udp_csum() to calculate the checksum but that assumes all fragments
>> have correct checksum in skb->csum which is not true for UFO fragments.
>>
>> When checking against MTU, we need to add skb->len to length of new segment
>> if we already have a partially filled skb and fragheaderlen only if there
>> isn't one.
>>
>> In the IPv6 case, skb can only be null if this is the first segment so that
>> we have to use headersize (length of the first IPv6 header) rather than
>> fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
>>
>> Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
>> Fixes: e4c5e13aa45c ("ipv6: Should use consistent conditional judgement for
>> ip6 fragment between __ip6_append_data and ip6_finish_output")
>> Signed-off-by: Michal Kubecek <mkubecek@xxxxxxx>
>> Acked-by: Vlad Yasevich <vyasevic@xxxxxxxxxx>
>> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
>> [SL: Drop changes to net/ipv4/ip_output.c because 4.9 doesn't have
>> e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")]
>
>Commit e89e9cf539a2 is in mainline since v2.6.16.28 so that 4.9 does
>have it. The conflict on cherry-pick is caused by missing commit
>e4c5e13aa45c but that is the same as in the IPv6 part. I'm adding the
>IPv4 part backported to 4.9 below
>
>> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxx>
>> ---
>> net/ipv6/ip6_output.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>> index 5a4b8e7bcedd..9213eba601d0 100644
>> --- a/net/ipv6/ip6_output.c
>> +++ b/net/ipv6/ip6_output.c
>> @@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
>> */
>>
>> cork->length += length;
>> - if ((((length + fragheaderlen) > mtu) ||
>> + if ((((length + (skb ? skb->len : headersize)) > mtu) ||
>> (skb && skb_is_gso(skb))) &&
>> (sk->sk_protocol == IPPROTO_UDP) &&
>> (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
>> --
>> 2.11.0
>
>This is strange, it looks as if e4c5e13aa45c was already applied before
>but I can't see that in stable-4.9.y branch.

It was added in this patchset.

>diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>index e5c1dbef3626..06215ba88b93 100644
>--- a/net/ipv4/ip_output.c
>+++ b/net/ipv4/ip_output.c
>@@ -936,7 +936,8 @@ static int __ip_append_data(struct sock *sk,
> csummode = CHECKSUM_PARTIAL;
>
> cork->length += length;
>- if (((length > mtu) || (skb && skb_is_gso(skb))) &&
>+ if ((((length + (skb ? skb->len : fragheaderlen)) > mtu) ||
>+ (skb && skb_is_gso(skb))) &&
> (sk->sk_protocol == IPPROTO_UDP) &&
> (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
> (sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {
>diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>index fd649599620e..9213eba601d0 100644
>--- a/net/ipv6/ip6_output.c
>+++ b/net/ipv6/ip6_output.c
>@@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
> */
>
> cork->length += length;
>- if (((length > mtu) ||
>+ if ((((length + (skb ? skb->len : headersize)) > mtu) ||
> (skb && skb_is_gso(skb))) &&
> (sk->sk_protocol == IPPROTO_UDP) &&
> (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&

Thanks!

--

Thanks,
Sasha