Re: recv() hangs until SIGCHLD ?

From: Nicolas Cannasse
Date: Mon Oct 13 2008 - 04:31:55 EST


If there is data and the thread didn't wake up then that is a libc or kernel problem;
but if there is no data, then look for cases where earlier interrupted io actually
consumed the data already or blame the sending process not the receiver.
Also are the sockets blocking or non-blocking?

The sockets are non-blocking.

Sorry, I made a typo here.

I meant to say that the sockets ARE blocking (the default behavior).

In a practical case, we have a thread that has been blocked in recv() for more than 12 hours, which is way beyond the timeout of the sender's connection. The sender has already closed the socket, so recv() should at least notice this and return 0.
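To make that expectation concrete, here is a minimal sketch of the behavior I would expect from a blocking socket. It uses a local socketpair() as a stand-in for the real TCP connection (an assumption for illustration; recv_after_peer_close is a hypothetical name), so it only shows what happens when the close actually reaches the receiving side:

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a connected pair of blocking stream sockets, closes the
 * "sender" end, then calls recv() on the "receiver" end.
 * Returns the recv() result; 0 (end-of-stream) is the expected value. */
static ssize_t recv_after_peer_close(void)
{
    int sv[2];
    char buf[16];
    ssize_t ret;
    int rc;

    rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);                        /* setup must succeed */

    close(sv[0]);                           /* sender closes its end */
    ret = recv(sv[1], buf, sizeof buf, 0);  /* should not block */
    close(sv[1]);
    return ret;
}
```

With a local pair the call returns right away with 0, which is what we expect to see on the real sockets as well.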

To provide more information:

Running lsof on the receiver shows several ESTABLISHED sockets connected to a given host/sender. Running lsof on that host shows no socket connected to the receiver (they were closed due to a timeout).

Also, the application correctly handles a return value of 0.
The pseudo-code is the following:

loop:
    ret = recv()
    if( ret == -1 ) {
        if( errno == EINTR ) goto loop;
        return -1;
    }
    return ret;

Then, at the higher level, if we get an error or end-of-stream ( ret <= 0 ), we close the socket.
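For anyone who wants to reproduce this, the logic above can be sketched in C roughly as follows. This is a simplified illustration and not our actual client code; the names recv_retry and read_or_close are hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Blocking recv() that retries when a signal interrupts the call.
 * Returns the number of bytes read, 0 if the peer closed the
 * connection, or -1 on any error other than EINTR. */
static ssize_t recv_retry(int fd, void *buf, size_t len)
{
    ssize_t ret;

    assert(buf != NULL);
    do {
        ret = recv(fd, buf, len, 0);
    } while (ret == -1 && errno == EINTR);
    return ret;
}

/* Higher level: both an error (-1) and end-of-stream (0) end the
 * connection, so the socket is closed in either case.
 * Returns 1 if data arrived, 0 if the socket was closed. */
static int read_or_close(int fd, void *buf, size_t len, ssize_t *nread)
{
    *nread = recv_retry(fd, buf, len);
    if (*nread <= 0) {
        close(fd);
        return 0;
    }
    return 1;
}
```

The do/while loop is equivalent to the goto in the pseudo-code: only EINTR causes a retry, every other result is passed up to the caller.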

At first we were using libmysqlclient, but since we hit the bug with it we wrote our own MySQL client so that we could more easily check what is happening. The same bug seems to occur with both implementations.

Best,
Nicolas
--
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html