Re: [2.6.24.3][net] bug: TCP 3rd handshake abnormal timeouts

From: Willy Tarreau
Date: Sat Mar 15 2008 - 04:56:15 EST


On Sat, Mar 15, 2008 at 09:47:24AM +0100, Gabriel Barazer wrote:
> Hi
>
> Thanks for the netdev Cc, I didn't know where to write to the "network
> guys".

except I just noticed I got it wrong: it's netdev@xxxxxxxxxxxxxxx, and
I omitted the "vger" part. That's what is expected when posting before
caffeine :-)

Feel free to repost the whole issue over there (along with your new tests)
if you don't get useful replies in a few days.

> By the way thanks for replying. It's hard to explain and describe a
> problem when you know people will ask you hundreds of questions related
> to application-level problems, or not reply because web/mysql problems
> are so common and generally not related to any kernel issue.

What caught my attention was the usual "3s delay", which is purely TCP
and application-independent.
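
A rough way to spot such delays from the client side is to time many
connect() calls and flag those that land just past a retransmission mark.
The sketch below assumes an initial retransmission timeout of 3 seconds
that doubles on each retry (giving cumulative marks of 3, 9 and 21
seconds); the function names, the slack value, and the host and port
passed in are hypothetical and only serve as an illustration.

```python
import socket
import time


def retransmit_marks(initial_rto=3.0, retries=3):
    """Cumulative delays after each retransmission, assuming the timeout
    starts at initial_rto seconds and doubles on every retry."""
    marks, total, rto = [], 0.0, initial_rto
    for _ in range(retries):
        total += rto
        marks.append(total)
        rto *= 2
    return marks


def classify(elapsed, slack=0.5):
    """Return how many retransmissions the elapsed time suggests (0 if none).

    slack is an assumed tolerance for network and scheduling latency."""
    for count, mark in enumerate(retransmit_marks(), start=1):
        if mark <= elapsed < mark + slack:
            return count
    return 0


def time_connect(host, port, timeout=30.0):
    """Return the seconds a TCP connect() took, or None if it failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        return None
    return time.monotonic() - start
```

Calling time_connect() in a loop against the server and passing each
result to classify() would show whether the slow connections cluster
around 3 seconds, which points at the handshake rather than at the
application.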

> On 03/15/2008 7:58:49 AM +0100, Willy Tarreau <w@xxxxxx> wrote:
> >
> >You should carefully check that the SYN-ACK received by the client has a
> >correct checksum ("cksum OK" in tcpdump output). It would be possible
> >that for some reason, something on the network randomly corrupts it.
>
> I used to use TCP offloading one time, and by the way never had a
> problem with it. Besides just to be sure, I have been able to reproduce
> the problem without any offload engine enabled (= not compiled into the
> kernel, mainly because it seems to hang the kernel at boot in 2.6.24.3).
> So I assume that is not the problem

OK

> I use wireshark to analyse my pcap files and it says the checksum is
> correct on all packets.

OK
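
For reference, the check that tcpdump and wireshark perform can be
reproduced by hand on a captured segment. The sketch below computes the
standard Internet checksum (one's complement sum of 16-bit words) over
the IPv4 pseudo-header plus the TCP segment. The function names are
hypothetical, and it assumes IPv4 and a segment given as raw bytes
starting at the TCP header.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header followed by the TCP segment."""
    pseudo = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment))
    )
    return internet_checksum(pseudo + segment)


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """A received segment is intact when the result, computed with the
    checksum field left in place, is zero."""
    return tcp_checksum(src_ip, dst_ip, segment) == 0
```

With the checksum field zeroed, tcp_checksum() gives the value the
sender should have written; with the field left as captured,
tcp_checksum_ok() tells whether the segment arrived uncorrupted.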

> >Also, you say you have netfilter with conntrack. Is this on the client ?
> >If so, you should try disabling it to rule out any possible bug in the
> >connection tracking.
>
> I have the conntrack on both the client and server, and unfortunately
> can't disable it now on the client (I use it only for the REDIRECT
> target on a precise destination address and port, not MySQL related),
> however I will test today and disable it on the server, after I get some
> sleep (although I think the issue is on the client).

I'm sure it's a client issue too, which is why it would be worth testing
without conntrack on the client. Can't you use a TCP proxy instead of
REDIRECT? Also, you said you noticed the same behaviour in other
environments; maybe you can disable conntrack there?

Willy
