Re: tcp_do_sendmsg()

Andi Kleen (ak@muc.de)
23 Mar 1998 18:30:51 +0100


"David S. Miller" <davem@dm.cobaltmicro.com> writes:

> Netscape calls connect(), makes the socket non-blocking, waits with
> select(), is woken up for the socket and starts a write(s, "GET
> /url", ..) on the socket.
>
> Ok.
>
> Because the socket is not established yet the kernel returns EAGAIN
> [why not ENOTCONN here, btw?],
>
> I think this is what POSIX mandates, maybe Alan knows if this is the
> case.

I did some research: Solaris 2.3 and Linux 2.0 return EAGAIN; HP-UX 9 and
AIX return ENOTCONN. I think ENOTCONN is more logical, but changing it
would probably break many Linux programs.
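
Since the errno differs between systems, a portable program has to treat
both values as "not connected yet". A minimal userspace sketch of that
idea follows; write_not_ready() and try_write() are hypothetical helper
names made up for illustration, and EINTR handling is left out:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if an errno value from write() on a
 * non-blocking socket means "connection not ready yet, try again later".
 * EAGAIN is what Solaris 2.3 and Linux 2.0 return, ENOTCONN is what
 * HP-UX 9 and AIX return.
 */
int write_not_ready(int err)
{
    if (err == EAGAIN || err == ENOTCONN)
        return 1;
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
    /* Some systems give EWOULDBLOCK a value of its own. */
    if (err == EWOULDBLOCK)
        return 1;
#endif
    return 0;
}

/*
 * Hypothetical helper: one write attempt.
 * Returns  1 if write() succeeded (byte count discarded for brevity),
 *          0 if the caller should go back to its select() loop,
 *         -1 on any other error (errno is left set).
 */
int try_write(int fd, const void *buf, size_t len)
{
    ssize_t n;

    assert(fd >= 0);
    n = write(fd, buf, len);
    if (n >= 0)
        return 1;
    return write_not_ready(errno) ? 0 : -1;
}
```

The point of returning 0 instead of retrying immediately is that the
caller gets back to its main select() and keeps servicing its other
descriptors rather than spinning on the socket.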

>
> netscape tries again, server never answers -> endless loop. The
> result is that netscape freezes because it never gets back to its
> main select() that dispatches the X11 connection.
>
> I've seen this several times now, and heard from others that it
> happened there too. This does not happen with 2.0.
>
> I can't think of what could possibly be different behavior wise in
> this case between 2.0.x and 2.1.x, except for tcp_poll(). I'll try to
> verify that function today, it might be the problem.

I looked over the code and couldn't find anything obvious. There is
only one strange case:

poll uses this test for not connected:
    (1 << sk->state) & ~(TCPF_SYN_SENT|TCPF_SYN_RECV)
sendmsg uses this test (in wait_for_tcp_connect):
    (1 << sk->state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)

This doesn't explain the netscape case [netscape is in SYN_SENT, so both
should be true], but is a difference at least. As far as I can see 2.0
uses the same tests though.
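
To make the difference concrete, here is a small standalone sketch that
evaluates both expressions for a given state. The state numbers follow
the kernel's TCP state enum (TCP_ESTABLISHED = 1 and so on); the helper
names poll_mask_test(), sendmsg_mask_test() and masks_conflict() are
hypothetical. It assumes a nonzero poll expression means "past the
handshake" and a nonzero sendmsg expression means "would wait":

```c
#include <assert.h>

/* TCP state numbers, in the order used by the Linux headers. */
enum {
    TCP_ESTABLISHED = 1,
    TCP_SYN_SENT,
    TCP_SYN_RECV,
    TCP_FIN_WAIT1,
    TCP_FIN_WAIT2,
    TCP_TIME_WAIT,
    TCP_CLOSE,
    TCP_CLOSE_WAIT,
    TCP_LAST_ACK,
    TCP_LISTEN,
    TCP_CLOSING
};

#define TCPF_ESTABLISHED (1 << TCP_ESTABLISHED)
#define TCPF_SYN_SENT    (1 << TCP_SYN_SENT)
#define TCPF_SYN_RECV    (1 << TCP_SYN_RECV)
#define TCPF_CLOSE_WAIT  (1 << TCP_CLOSE_WAIT)

/* The expression quoted for poll: zero only in SYN_SENT and SYN_RECV. */
int poll_mask_test(int state)
{
    assert(state >= TCP_ESTABLISHED && state <= TCP_CLOSING);
    return ((1 << state) & ~(TCPF_SYN_SENT | TCPF_SYN_RECV)) != 0;
}

/* The expression quoted for sendmsg: zero only in ESTABLISHED and CLOSE_WAIT. */
int sendmsg_mask_test(int state)
{
    assert(state >= TCP_ESTABLISHED && state <= TCP_CLOSING);
    return ((1 << state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) != 0;
}

/*
 * Returns 1 for states where the two tests pull in different directions:
 * the poll expression is nonzero (past the handshake) while the sendmsg
 * expression is also nonzero (sendmsg would still wait).
 */
int masks_conflict(int state)
{
    return poll_mask_test(state) && sendmsg_mask_test(state);
}
```

In SYN_SENT the poll expression is zero and the sendmsg expression is
nonzero, so both agree the socket is not connected yet. The two only
diverge in states such as FIN_WAIT1 or CLOSE, which are outside both
masks.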

There is another candidate for a TCP optimization, BTW [I can't do it
myself at the moment, because I don't have the LAN setup needed to run
the benchmarks]: the writable test in tcp_poll has a big influence on
the performance of non-blocking servers like squid.

-Andi
