Re: [PATCH] tcp: check socket state before calling WARN_ON

From: Eric Dumazet
Date: Fri Dec 06 2024 - 04:09:41 EST


On Fri, Dec 6, 2024 at 9:58 AM Youngmin Nam <youngmin.nam@xxxxxxxxxxx> wrote:
>
> On Fri, Dec 06, 2024 at 09:35:32AM +0100, Eric Dumazet wrote:
> > On Fri, Dec 6, 2024 at 6:50 AM Youngmin Nam <youngmin.nam@xxxxxxxxxxx> wrote:
> > >
> > > On Wed, Dec 04, 2024 at 08:13:33AM +0100, Eric Dumazet wrote:
> > > > On Wed, Dec 4, 2024 at 4:35 AM Youngmin Nam <youngmin.nam@xxxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Dec 03, 2024 at 06:18:39PM -0800, Jakub Kicinski wrote:
> > > > > > On Tue, 3 Dec 2024 10:34:46 -0500 Neal Cardwell wrote:
> > > > > > > > I have not seen these warnings firing. Neal, have you seen this in the past ?
> > > > > > >
> > > > > > > I can't recall seeing these warnings over the past 5 years or so, and
> > > > > > > (from checking our monitoring) they don't seem to be firing in our
> > > > > > > fleet recently.
> > > > > >
> > > > > > FWIW I see this at Meta on 5.12 kernels, but nothing since.
> > > > > > Could be that one of our workloads is pinned to 5.12.
> > > > > > Youngmin, what's the newest kernel you can repro this on?
> > > > > >
> > > > > Hi Jakub.
> > > > > Thank you for taking an interest in this issue.
> > > > >
> > > > > We've seen this issue since 5.15 kernel.
> > > > > Now, we can see this on 6.6 kernel which is the newest kernel we are running.
> > > >
> > > > The fact that we are processing ACK packets after the write queue has
> > > > been purged would be a serious bug.
> > > >
> > > > Thus the WARN() makes sense to us.
> > > >
> > > > It would be easy to build a packetdrill test. Please do so, then we
> > > > can fix the root cause.
> > > >
> > > > Thank you !
> > > >
> > >
> > > Hi Eric.
> > >
> > > Unfortunately, we are not familiar with the Packetdrill test.
> > > Referring to the official website on GitHub, I tried to install it on my device.
> > >
> > > Here is what I did on my local machine.
> > >
> > > $ mkdir packetdrill
> > > $ cd packetdrill
> > > $ git clone https://github.com/google/packetdrill.git .
> > > $ cd gtests/net/packetdrill/
> > > $ ./configure
> > > $ make CC=/home/youngmin/Downloads/arm-gnu-toolchain-13.3.rel1-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-gcc
> > >
> > > $ adb root
> > > $ adb push packetdrill /data/
> > > $ adb shell
> > >
> > > And here is what I did on my device
> > >
> > > erd9955:/data/packetdrill/gtests/net # ./packetdrill/run_all.py -S -v -L -l tcp/
> > > /system/bin/sh: ./packetdrill/run_all.py: No such file or directory
> > >
> > > I'm not sure if this procedure is correct.
> > > Could you help us run the Packetdrill on an Android device ?
> >
> > packetdrill can run anywhere, for instance on your laptop, no need to
> > compile / install it on Android
> >
> > Then you can run a single test like
> >
> > # packetdrill gtests/net/tcp/sack/sack-route-refresh-ip-tos.pkt
> >
>
> You mean that to test an Android device, we need to run packetdrill on a laptop, right?
>
> Laptop(run packetdrill script) <--------------------------> Android device
>
> By the way, how can we test the Android device (DUT) from packetdrill which is running on Laptop?
> I hope you understand that I am asking this question because we are not familiar with packetdrill.
> Thanks.

packetdrill does not need to run on a physical DUT; it uses a software
stack: TCP and a tun device.

If you have a kernel tree, compile it and run it in a VM, for instance with virtme-ng:

vng -bv

We use this to run kernel selftests, to which we started adding
packetdrill tests (in recent kernel trees):

./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-4pkt.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_client.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_batch.pkt
./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-after-win-update.pkt
./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-fq-ack-per-2pkt.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_maxfrags.pkt
./tools/testing/selftests/net/packetdrill/tcp_inq_server.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_epoll_exclusive.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_basic.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_small.pkt
./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-app-limited-9-packets-out.pkt
./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-2pkt.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_epoll_oneshot.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_fastopen-server.pkt
./tools/testing/selftests/net/packetdrill/tcp_inq_client.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_epoll_edge.pkt
./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-app-limited.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_fastopen-client.pkt
./tools/testing/selftests/net/packetdrill/tcp_zerocopy_closed.pkt
./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-1pkt.pkt
./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-after-idle.pkt
./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-2pkt-send-5pkt.pkt
./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-2pkt-send-6pkt.pkt
./tools/testing/selftests/net/packetdrill/tcp_md5_md5-only-on-client-ack.pkt
./tools/testing/selftests/net/netfilter/packetdrill/conntrack_synack_old.pkt
./tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt
./tools/testing/selftests/net/netfilter/packetdrill/conntrack_inexact_rst.pkt
./tools/testing/selftests/net/netfilter/packetdrill/conntrack_synack_reuse.pkt
./tools/testing/selftests/net/netfilter/packetdrill/conntrack_rst_invalid.pkt
./tools/testing/selftests/net/netfilter/packetdrill/conntrack_ack_loss_stall.pkt
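
A list like the one above can be produced from the top of a kernel tree,
and the tests can then be run through the usual kselftest entry point.
The following is only a sketch: the helper names list_pkt_tests and
run_pkt_selftests are hypothetical, and the make invocation assumes a
tree recent enough to ship the net/packetdrill selftest target, with a
packetdrill binary already in PATH inside the VM.

```shell
#!/bin/sh
# Hypothetical helper: list every packetdrill script under the
# networking selftests of the kernel tree given as $1 (default: cwd).
list_pkt_tests() {
    tree="${1:-.}"
    find "$tree/tools/testing/selftests/net" -type f -name '*.pkt' | sort
}

# Hypothetical helper: run the packetdrill selftests through kselftest.
# Assumption: the tree has the net/packetdrill target and packetdrill
# is installed in PATH; typically run from inside the VM (e.g. vng).
run_pkt_selftests() {
    tree="${1:-.}"
    make -C "$tree/tools/testing/selftests" TARGETS="net/packetdrill" run_tests
}
```

For example, "list_pkt_tests ." from the top of the tree prints one
.pkt path per line, and any of those paths can be handed to
packetdrill directly to run a single test.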