> It may have made sense a while ago to create an inode for each
> socket, back before there were dentries and before you could create
> more than 4K sockets per system. Nowadays I think it is wasteful and
> introduces DoS potential and system instability where one might
> expect normal behavior. Administrators need to be aware of how to
> set their fd and socket maximums, and of what can happen if they
> don't set them appropriately -- this is now a security issue.
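
(For reference, the per-process fd ceiling being talked about here is
the RLIMIT_NOFILE rlimit, with /proc/sys/fs/file-max bounding things
system-wide. A minimal userland sketch of how a server might raise
its soft limit up to the hard limit:

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;

        /* Read the current soft/hard limits on open fds. */
        if (getrlimit(RLIMIT_NOFILE, &rl) < 0) {
            perror("getrlimit");
            return 1;
        }
        printf("soft %lu hard %lu\n",
               (unsigned long) rl.rlim_cur,
               (unsigned long) rl.rlim_max);

        /* Raise the soft limit to the hard limit; going beyond
           the hard limit requires privilege. */
        rl.rlim_cur = rl.rlim_max;
        if (setrlimit(RLIMIT_NOFILE, &rl) < 0) {
            perror("setrlimit");
            return 1;
        }
        return 0;
    }

)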
You make some valid points, certainly, but here is some food
for thought:
1) Some sockets do need a bona fide inode attached to them,
for example UNIX domain sockets bound to a name in the filesystem
namespace (see the sketch just after this list).
2) From my experiments over a year ago, I recall one statistic:
even with all of this overhead, the kernel could manage ~40,000
full sockets in ~64MB of RAM, i.e. roughly 1.6KB of kernel
memory per socket, all overhead included. This was on a 64-bit
system.
3) Most "high end, high connection rate" systems are not full of
sockets sitting in ESTABLISHED state, no, but rather they are
consumed mostly by sockets in TIME_WAIT. This case we have in
fact optimized: we throw away the inode/dentry/file when we
enter this state and even use a miniature version of "struct
sock" so that we only keep around the pieces of information we
need. This structure is about 120 bytes on a 64-bit machine,
last time I checked.
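
Coming back to point 1: binding a UNIX domain socket to a pathname
really does create an entry, and hence an inode, in the filesystem.
A minimal sketch (the /tmp/demo.sock path is made up for
illustration):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    int main(void)
    {
        struct sockaddr_un sun;
        int fd;

        fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0) {
            perror("socket");
            return 1;
        }

        memset(&sun, 0, sizeof(sun));
        sun.sun_family = AF_UNIX;
        strcpy(sun.sun_path, "/tmp/demo.sock");

        /* This creates /tmp/demo.sock; "ls -l /tmp" will show it
           as an entry of type 's', backed by a real inode. */
        if (bind(fd, (struct sockaddr *) &sun, sizeof(sun)) < 0) {
            perror("bind");
            return 1;
        }

        unlink("/tmp/demo.sock");   /* remove the name again */
        close(fd);
        return 0;
    }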
#3, in my experience, is by far the greatest thing to plane down
for top-notch TCP connection rates and performance. In fact, work
on this continues: we have experimental TIME_WAIT recycling code
in 2.3.x which attempts to safely kill TIME_WAIT sockets early.
There are some final details to attend to, because if we do this
we must do it correctly, so there is a chance it may not be
finished and enabled in time for the final 2.4.x release.
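
To make the miniature TIME_WAIT structure from point 3 concrete,
here is a rough, purely hypothetical sketch of the kind of state
such a bucket has to keep; the actual kernel structure carries a
few more fields (state bits, more list linkage), which is how it
reaches the ~120 bytes quoted above:

    #include <stdio.h>

    /* Hypothetical sketch only -- not the real kernel structure. */
    struct tw_bucket_sketch {
        unsigned int daddr, saddr;       /* connection identity */
        unsigned short dport, sport;
        unsigned int rcv_nxt, snd_nxt;   /* validate stray segments */
        unsigned int ts_recent;          /* PAWS (RFC 1323) state */
        long ts_recent_stamp;
        struct tw_bucket_sketch *next, **pprev; /* hash linkage */
        unsigned long timeout;           /* death timer expiry */
    };

    int main(void)
    {
        printf("%lu bytes\n",
               (unsigned long) sizeof(struct tw_bucket_sketch));
        return 0;
    }

Everything else a full socket drags around -- buffers, options, the
file/inode/dentry triple -- is simply gone by the time a connection
sits in this state.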
Later,
David S. Miller
davem@redhat.com