Re: commit f5f99309 (sock: do not set sk_err in sock_dequeue_err_skb) has broken ping

From: Soheil Hassas Yeganeh
Date: Thu Jun 01 2017 - 12:43:37 EST


On Thu, Jun 1, 2017 at 11:36 AM, Cyril Hrubis <chrubis@xxxxxxx> wrote:
> It seems to repeatedly produce (until I plug the cable back):
>
> ee_errno = 113 ee_origin = 2 ee_type = 3 ee_code = 1 ee_info = 0 ee_data = 0
>
> So we get EHOSTUNREACH on SO_EE_ORIGIN_ICMP.

Thank you very much! I have a wild guess that, when we
have a train of skbs on the error queue starting from a local error,
we will see this issue.

Ping (without my patch) considers EAGAIN on a normal read as an
indication that there is nothing on the error queue, but that's a
flawed assumption.

Would you mind trying another shot in the dark, please? Thanks!

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 5a726161f4e4..097152a03c74 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3742,7 +3742,8 @@ EXPORT_SYMBOL(sock_queue_err_skb);
 static bool is_icmp_err_skb(const struct sk_buff *skb)
 {
 	return skb && (SKB_EXT_ERR(skb)->ee.ee_origin == SO_EE_ORIGIN_ICMP ||
-		       SKB_EXT_ERR(skb)->ee.ee_origin == SO_EE_ORIGIN_ICMP6);
+		       SKB_EXT_ERR(skb)->ee.ee_origin == SO_EE_ORIGIN_ICMP6 ||
+		       SKB_EXT_ERR(skb)->ee.ee_origin == SO_EE_ORIGIN_LOCAL);
 }

 struct sk_buff *sock_dequeue_err_skb(struct sock *sk)
@@ -3751,14 +3752,19 @@ struct sk_buff *sock_dequeue_err_skb(struct sock *sk)
 	struct sk_buff *skb, *skb_next = NULL;
 	bool icmp_next = false;
 	unsigned long flags;
+	int err = 0;

 	spin_lock_irqsave(&q->lock, flags);
 	skb = __skb_dequeue(q);
-	if (skb && (skb_next = skb_peek(q)))
+	if (skb && (skb_next = skb_peek(q))) {
 		icmp_next = is_icmp_err_skb(skb_next);
+		err = SKB_EXT_ERR(skb_next)->ee.ee_origin;
+	}
 	spin_unlock_irqrestore(&q->lock, flags);

-	if (is_icmp_err_skb(skb) && !icmp_next)
+	if (icmp_next)
+		sk->sk_err = err;
+	else if (is_icmp_err_skb(skb) && !icmp_next)
 		sk->sk_err = 0;

 	if (skb_next)