Re: TCP connections dropped

From: Maciej Soltysiak
Date: Mon Sep 22 2003 - 13:07:08 EST


On Mon, 22 Sep 2003, Hassan M. Jafri wrote:

> I am running a parallel program on 170 nodes, with 2 processes on each
> node, so 340 processes in total. Each process has a TCP connection
> established with every other process, so each process has 339 sockets in
> ESTABLISHED state. The problem occurs when I try to write() to these
> sockets. The TCP connection gets dropped for some of the sockets of a few
> processes as soon as they try to write to those sockets. This problem,
> however, does not occur if I reduce the number of processes to less than
> 306 (305 TCP sockets/connections for each process).
>
> Any ideas why connections are getting dropped?
I guess your hosts run out of memory for storing open sockets.
It's probably like SYN-flooding yourself to death.
When I was once running SYN flood tests, I could hold only about 170
connections before seeing the same effect as you: packet drops.
So:
1. Try enabling syncookies (both compiled into the kernel with
CONFIG_SYN_COOKIES _and_ switched on in
/proc/sys/net/ipv4/tcp_syncookies)
2. Try increasing
/proc/sys/net/ipv4/tcp_max_syn_backlog
3. Try reading about the other memory-related knobs in
/usr/src/linux/Documentation/networking/ip-sysctl.txt
4. When reading about syncookies in ip-sysctl, read also:
http://cr.yp.to/syncookies.html
especially the last section called: SYN cookie monsters.
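
To check where a node currently stands on points 1 and 2, something like
the following could be run on each host. It is only a minimal sketch: the
helper names read_knobs and write_knob are made up for illustration, it
assumes a Linux /proc filesystem, and writing a knob needs root.

```python
from pathlib import Path

# The two knobs from points 1 and 2; both live under /proc/sys/net/ipv4.
KNOBS = ("tcp_syncookies", "tcp_max_syn_backlog")


def read_knobs(base="/proc/sys/net/ipv4", names=KNOBS):
    """Return {name: int or None}; None means missing or unreadable.

    A missing tcp_syncookies file usually means the kernel was built
    without syncookie support.
    """
    values = {}
    for name in names:
        try:
            text = (Path(base) / name).read_text()
            values[name] = int(text.split()[0])
        except (OSError, ValueError, IndexError):
            values[name] = None
    return values


def write_knob(name, value, base="/proc/sys/net/ipv4"):
    """Write a new integer value to one knob (root only on a real system)."""
    (Path(base) / name).write_text(f"{value}\n")


if __name__ == "__main__":
    for name, value in read_knobs().items():
        print(f"{name} = {value}")
```

The base argument exists only so the helpers can be pointed at a scratch
directory for testing. Values written this way do not survive a reboot.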

ip-sysctl.txt and the cr.yp.to page contradict each other about
'protocol violation', etc. I wonder whether this has been sorted out
among Kuznetsov, Akkerman, Metzger and Bernstein?
Anyway, I feel safe with D. J. Bernstein's notes about it.

Regards,
Maciej
