Re: [PATCH RESEND net-next] tcp: socket-specific version of WARN_ON_ONCE()
From: Iwashima, Kuniyuki
Date: Tue Nov 29 2022 - 16:16:51 EST
> On Nov 29, 2022, at 21:48, Breno Leitao <leitao@xxxxxxxxxx> wrote:
>> On Tue, Nov 29, 2022 at 10:00:55AM +0900, Kuniyuki Iwashima wrote:
>> From: Breno Leitao <leitao@xxxxxxxxxx>
>> Date: Thu, 24 Nov 2022 03:22:29 -0800
>>> There are cases where we need information about the socket during a
>>> warning, so that it can help us find bugs that happen but do not have
>>> an easy repro.
>>>
>>> This diff creates a TCP socket-specific version of WARN_ON_ONCE(), which
>>> dumps more information about the TCP socket.
>>>
>>> This new warning is not only useful to give more insight into kernel bugs, but
>>> it is also helpful to expose information that might be coming from buggy
>>> BPF applications, such as BPF applications that set invalid
>>> tcp_sock->snd_cwnd values.
>>
>> Have you finally found the root cause, on the BPF or the TCP side?
>
> Yes, this proved to be very useful for finding BPF applications
> that are doing nasty things with the congestion window.
>
> We currently have this patch applied to Meta's infrastructure to track
> BPF applications that are misbehaving, and to easily track down which
> BPF application is responsible.
If you have a fix merged on the BPF side,
it would be helpful to mention the commit so that
we can better understand the issue, its background,
and why other tooling is not enough, as Paolo wondered.
>>> +#endif /* _LINUX_TCP_DEBUG_H */
>>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>>> index 54836a6b81d6..dd682f60c7cb 100644
>>> --- a/net/ipv4/tcp.c
>>> +++ b/net/ipv4/tcp.c
>>> @@ -4705,6 +4705,36 @@ int tcp_abort(struct sock *sk, int err)
>>> }
>>> EXPORT_SYMBOL_GPL(tcp_abort);
>>>
>>> +void tcp_sock_warn(const struct tcp_sock *tp)
>>> +{
>>> + const struct sock *sk = (const struct sock *)tp;
>>> + struct inet_sock *inet = inet_sk(sk);
>>> + struct inet_connection_sock *icsk = inet_csk(sk);
>>> +
>>> + WARN_ON(1);
>>> +
>>> + if (!tp)
>>
>> Is this needed ?
>
> We are dereferencing tp/sk in the lines below, so I think it is safe to
> check that they are not NULL before dereferencing them.
tp->snd_cwnd is accessed just after this WARN,
so I thought there were no cases where tp is NULL.
If such a case existed, KASAN should be complaining.
I think this additional check could confuse future readers, so I
want to make sure whether there is such a case.
Thank you!
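
To illustrate the point about the NULL check, here is a minimal userspace
sketch, not the kernel code. struct fake_tcp_sock, fake_sock_warn(),
FAKE_SOCK_WARN_ON_ONCE() and fake_snd_cwnd_set() are hypothetical stand-ins
that only mimic the shape of the patch. The caller writes tp->snd_cwnd right
after the warning, so a NULL tp would crash there regardless of any check
inside the warn helper.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct tcp_sock; only the field we need. */
struct fake_tcp_sock {
	unsigned int snd_cwnd;
};

static int fake_warn_count; /* how many times the helper actually ran */

/* Hypothetical analogue of tcp_sock_warn(): dump some socket context. */
static void fake_sock_warn(const struct fake_tcp_sock *tp)
{
	fake_warn_count++;
	fprintf(stderr, "Socket Info: cwnd=%u\n", tp->snd_cwnd);
}

/* Warn at most once per call site, passing the socket to the helper. */
#define FAKE_SOCK_WARN_ON_ONCE(tp, cond)			\
	do {							\
		static bool warned;				\
		if ((cond) && !warned) {			\
			warned = true;				\
			fake_sock_warn(tp);			\
		}						\
	} while (0)

static void fake_snd_cwnd_set(struct fake_tcp_sock *tp, unsigned int val)
{
	FAKE_SOCK_WARN_ON_ONCE(tp, (int)val <= 0);
	/* tp is dereferenced here whether or not the warning fired. */
	tp->snd_cwnd = val;
}
```

In this sketch, a NULL check inside fake_sock_warn() would only skip the
dump; the store in fake_snd_cwnd_set() would still fault on a NULL tp.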
>
> Should I check "sk" instead of "tp" to make the code a bit
> cleaner to read?
>
>>> + pr_warn("Socket Info: family=%u state=%d sport=%u dport=%u ccname=%s cwnd=%u",
>>> + sk->sk_family, sk->sk_state, ntohs(inet->inet_sport),
>>> + ntohs(inet->inet_dport), icsk->icsk_ca_ops->name, tcp_snd_cwnd(tp));
>>> +
>>> + switch (sk->sk_family) {
>>> + case AF_INET:
>>> + pr_warn("saddr=%pI4 daddr=%pI4", &inet->inet_saddr,
>>> + &inet->inet_daddr);
>>
>> As with tcp_syn_flood_action(), isn't the [address]:port format easier
>> to read and consistent with the rest of the kernel ?
>
> Absolutely. I am going to fix it in v2. Thanks!
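
For reference, the [address]:port layout discussed above can be sketched in
userspace as follows. This is only an illustration: format_endpoint() is a
hypothetical helper, and in-kernel code would rely on printk specifiers such
as %pI4 and %pI6c rather than inet_ntop(). The sketch prints IPv4 endpoints
as addr:port and wraps IPv6 addresses in brackets so the port separator stays
unambiguous.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical userspace helper: format an endpoint as "addr:port" for
 * AF_INET and "[addr]:port" for AF_INET6. port_be is in network byte
 * order, like inet_sport/inet_dport. Returns 0 on success, -1 on error.
 */
static int format_endpoint(int family, const void *addr, uint16_t port_be,
			   char *buf, size_t len)
{
	char ip[INET6_ADDRSTRLEN];
	unsigned int port = ntohs(port_be);
	int n;

	/* inet_ntop() fails for unsupported families or short buffers. */
	if (!inet_ntop(family, addr, ip, sizeof(ip)))
		return -1;

	if (family == AF_INET6)
		n = snprintf(buf, len, "[%s]:%u", ip, port);
	else
		n = snprintf(buf, len, "%s:%u", ip, port);

	/* Treat truncation as an error so callers never log partial output. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Keeping the address and port in one token also means a single log line
carries the whole endpoint, instead of splitting ports and addresses across
two pr_warn() calls.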