I'm pretty sure the problem is large numbers of sockets in TIME_WAIT.
The machine runs an "rpc-like" server process: a perl daemon that does
remote perl calls over sockets. Due to the braindamaged way this thing
is designed, it opens a new socket for every call. A separate bug
caused an error to bounce back and forth between the server and the
client box, which left huge numbers of sockets in TIME_WAIT. I don't
know what the count was when it died, but a couple of times I've seen
in excess of 900 sockets in TIME_WAIT before killing everything. I
imagine the number is much higher by the time it dies.

I'm going to try to reproduce this here (with a different network
card) today. If people tell me what info they want, I can hopefully
have a description of the problem and a short exploit program up
today.
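
For anyone who wants to watch the TIME_WAIT count while reproducing
this, here is a minimal sketch in Python. It is not the tool used for
the numbers above. It assumes the Linux /proc/net/tcp layout, where
the fourth column is the socket state in hex and 06 means TIME_WAIT.
The function names are made up for illustration.

```python
TIME_WAIT = "06"  # hex state code for TIME_WAIT in /proc/net/tcp


def count_time_wait(lines):
    """Count TIME_WAIT entries in an iterable of /proc/net/tcp-style lines."""
    count = 0
    for line in lines:
        fields = line.split()
        # Data rows start with a slot number such as "12:"; this skips
        # the header row and any blank or malformed lines.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        if fields[3] == TIME_WAIT:
            count += 1
    return count


def count_time_wait_on_host(path="/proc/net/tcp"):
    """Read the kernel's TCP socket table (assumed path) and count TIME_WAIT."""
    with open(path) as f:
        return count_time_wait(f)
```

Calling count_time_wait_on_host() every second or so while the
client and server are bouncing calls should show whether the count
keeps climbing. Keeping the parsing separate from the file read
makes it easy to check against a saved copy of the table.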
Sean