Re: [PATCH] tcp: check socket state before calling WARN_ON

From: Neal Cardwell
Date: Fri Jan 17 2025 - 10:19:40 EST


On Fri, Jan 17, 2025 at 12:04 AM Youngmin Nam <youngmin.nam@xxxxxxxxxxx> wrote:
>
> > Thanks for all the details! If the ramdump becomes available again at
> > some point, would it be possible to pull out the following values as
> > well:
> >
> > tp->mss_cache
> > inet_csk(sk)->icsk_pmtu_cookie
> > inet_csk(sk)->icsk_ca_state
> >
> > Thanks,
> > neal
> >
>
> Hi Neal. Happy new year.
>
> We are currently trying to capture a tcpdump during the problem situation
> to construct the Packetdrill script. However, this issue does not occur very often.
>
> By the way, we have a full ramdump, so we can provide the information you requested.
>
> tp->packets_out = 0
> tp->sacked_out = 0
> tp->lost_out = 4
> tp->retrans_out = 1
> tcp_is_sack(tp) = 1
> tp->mss_cache = 1428
> inet_csk(sk)->icsk_ca_state = 4
> inet_csk(sk)->icsk_pmtu_cookie = 1500
>
> If you need any specific information from the ramdump, please let me know.

The icsk_ca_state = 4 is interesting, since that's TCP_CA_Loss,
indicating RTO recovery. Perhaps the socket suffered many recurring
timeouts and timed out with ETIMEDOUT, causing the
tcp_write_queue_purge() call that reset packets_out to 0... and then
some race happened during the teardown process that caused another
incoming packet to be processed in this resulting inconsistent state?

Do you have a way to use GDB or a similar tool to print all the fields
of the socket? Like:

(gdb) p *(struct tcp_sock*) some_hex_address_goes_here

?

If so, that could be useful in extracting further hints about what
state this socket is in.

If that's not possible, but a few extra fields are possible, would you
be able to pull out the following:

tp->retrans_stamp
tp->tcp_mstamp
icsk->icsk_retransmits
icsk->icsk_backoff
icsk->icsk_rto

thanks,
neal