Re: [PATCH net v2] tcp: check skb is non-NULL in tcp_rto_delta_us()

From: Neal Cardwell
Date: Sun Sep 08 2024 - 13:51:26 EST


On Fri, Sep 6, 2024 at 7:17 PM Josh Hunt <johunt@xxxxxxxxxx> wrote:
>
...
> The NULL ptr deref is coming from tcp_rto_delta_us() attempting to pull an skb
> off the head of the retransmit queue and then dereferencing that skb to get the
> skb_mstamp_ns value via tcp_skb_timestamp_us(skb).
>
> The crash is the same one that was reported a # of years ago here:
> https://lore.kernel.org/netdev/86c0f836-9a7c-438b-d81a-839be45f1f58@xxxxxxxxx/T/#t
>
> and the kernel we're running has the fix which was added to resolve this issue.
>
> Unfortunately we've been unsuccessful so far in reproducing this problem in the
> lab and do not have the luxury of pushing out a new kernel to try and test if
> newer kernels resolve this issue at the moment. I realize this is a report
> against both an Ubuntu kernel and also an older 5.4 kernel. I have reported this
> issue to Ubuntu here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2077657
> however I feel like since this issue has possibly cropped up again it makes
> sense to build in some protection in this path (even on the latest kernel
> versions) since the code in question just blindly assumes there's a valid skb
> without testing if it's NULL b/f it looks at the timestamp.
>
> Given we have seen crashes in this path before and now this case it seems like
> we should protect ourselves for when packets_out accounting is incorrect.
> While we should fix that root cause we should also just make sure the skb
> is not NULL before dereferencing it. Also add a warn once here to capture
> some information if/when the problem case is hit again.
>
> Signed-off-by: Josh Hunt <johunt@xxxxxxxxxx>

Since this is targeted to the net branch to fix crashes at least as
far back as 5.4, AFAICT it would be good to have a Fixes: footer, so
maintainers know how far back in stable release history to apply the
fix.

I'd suggest pointing to this linux/v4.13 commit:

Fixes: e1a10ef7fa87 ("tcp: introduce tcp_rto_delta_us() helper for
xmit timer fix")

The bug actually predates that commit (the code before that already
assumed tcp_write_queue_head() was non-NULL in tcp_rearm_rto() if
packets_out is non-zero). But that commit is the first point at which
tcp_rto_delta_us() exists as a function and so it's straightforward to
apply the patch (albeit with some conflicts in earlier kernels). And
that commit is far enough back to imply that the fix should be
backported to all "longterm" releases listed at
https://www.kernel.org/ ...

> ---
>
> v2: Removed cover letter and added context from original cover letter to
> commit msg.
>
> include/net/tcp.h | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 2aac11e7e1cc..19ea6ed87880 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -2433,10 +2433,19 @@ void tcp_plb_update_state_upon_rto(struct sock *sk, struct tcp_plb_state *plb);
> static inline s64 tcp_rto_delta_us(const struct sock *sk)
> {
> const struct sk_buff *skb = tcp_rtx_queue_head(sk);
> - u32 rto = inet_csk(sk)->icsk_rto;
> - u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + jiffies_to_usecs(rto);
> + u32 rto = jiffies_to_usecs(inet_csk(sk)->icsk_rto);
> +
> + if (likely(skb)) {
> + u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + rto;
> +
> + return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
> + } else {
> + WARN_ONCE(1,
> + "rtx queue emtpy: inflight %u tlp_high_seq %u state %u\n",
> + tcp_sk(sk)->packets_out, tcp_sk(sk)->tlp_high_seq, sk->sk_state);
> + return rto;
> + }
>
> - return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
> }
>
> /*
> --

IMHO it would be nice to have the WARN_ONCE print more information, to
help debug these cases. This seems like some sort of packet counting
bug, so IMHO it would be nice to have more information about packet
counts and MTU/MSS (since MTU/MSS changes force recalculation of
packet counts for skbs and the scoreboard, and have thus been a
traditional source of packet-counting bugs). Perhaps something like
the following (compiled but not tested):

+ WARN_ONCE(1,
+ "rtx queue empty: "
+ "out:%u sacked:%u lost:%u retrans:%u "
+ "tlp_high_seq:%u sk_state:%u ca_state:%u "
+ "advmss:%u mss_cache:%u pmtu:%u\n",
+ tcp_sk(sk)->packets_out, tcp_sk(sk)->sacked_out,
+ tcp_sk(sk)->lost_out, tcp_sk(sk)->retrans_out,
+ tcp_sk(sk)->tlp_high_seq, sk->sk_state,
+ inet_csk(sk)->icsk_ca_state,
+ tcp_sk(sk)->advmss, tcp_sk(sk)->mss_cache,
+ inet_csk(sk)->icsk_pmtu_cookie);

Thanks!

neal