Re: [PATCH] tools/testing/selftests/bpf/test_tc_tunnel.sh: Prevent client connect before server bind

From: Alessandro Carminati
Date: Sun Mar 10 2024 - 05:46:26 EST


Hi Martin,
Thanks for the review.

Il giorno ven 8 mar 2024 alle ore 02:03 Martin KaFai Lau
<martin.lau@xxxxxxxxx> ha scritto:
>
> On 2/29/24 6:00 AM, Alessandro Carminati (Red Hat) wrote:
> > In some systems, the netcat server can incur in delay to start listening.
> > When this happens, the test can randomly fail in various points.
> > This is an example error message:
> > # ip gre none gso
> > # encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
> > # test basic connectivity
> > # Ncat: Connection refused.
>
> This explained what is the issue. Please also explain how the patch solves it.
>
The issue, as stated, depends on a race condition between the netcat client
and server. The test author addressed this problem using a sleep, which I
removed in this patch. To easily solve the issue, one could simply increase
the sleep duration. However, this patch opts to tackle the problem by
querying the /proc directory and verifying TCP binds at the specified port
before letting the client connect.
> >
> > Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@xxxxxxxxx>
> > ---
> > tools/testing/selftests/bpf/test_tc_tunnel.sh | 19 ++++++++++++++++++-
> > 1 file changed, 18 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
> > index 910044f08908..01c0f4b1a8c2 100755
> > --- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
> > +++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
> > @@ -72,7 +72,6 @@ cleanup() {
> > server_listen() {
> > ip netns exec "${ns2}" nc "${netcat_opt}" -l "${port}" > "${outfile}" &
> > server_pid=$!
> > - sleep 0.2
> > }
> >
> > client_connect() {
> > @@ -93,6 +92,22 @@ verify_data() {
> > fi
> > }
> >
> > +wait_for_port() {
> > + local digits=8
> > + local port2check=$(printf ":%04X" $1)
> > + local prot=$([ "$2" == "-6" ] && echo 6 && digits=32)
> > +
> > + for i in $(seq 20); do
> > + if ip netns exec "${ns2}" cat /proc/net/tcp${prot} | \
> > + sed -r 's/^[ \t]+[0-9]+: ([0-9A-F]{'${digits}'}:[0-9A-F]{4}) .*$/\1/' | \
> > + grep -q "${port2check}"; then
>
> The idea is to check if there is socket listening on port 8888?
>
> May be something simpler like "ss -OHtl src :$1" instead?
Indeed, the aim is to ensure that the server is bound before the
client attempts to
connect by checking if socket is listening, and yes using 'ss' would be shorter.
However, I chose not to use 'ss' or 'netstat' to avoid adding new dependencies,
considering they are already many.
Nonetheless, 'ss' is preferred, I have no objections.
>
> --
> pw-bot: cr
>
> The check-and-wait fix in this patch is fine to get your test environment going.
>
> Eventually, it will be good to see the test_tc_tunnel.sh test moved to
> test_progs. The test_tc_tunnel.sh is not run by bpf CI and issue like this got
> unnoticed. Some other "*.sh" tests have already been moved to test_progs.
>
>
Regards
Alessandro