Re: Thread implementations...

Dean Gaudet (dgaudet-list-linux-kernel@arctic.org)
Sat, 20 Jun 1998 14:13:30 -0700 (PDT)


On Sun, 21 Jun 1998, Richard Gooch wrote:

> People seem to be very interested in the new connection case. This
> doesn't seem all that exciting or interesting to me (just have one
> thread blocking in accept(2) and store the new FD in a global
> array). To me the problem of processing data on existing connections
> is more interesting (and harder to solve: hence more interesting:-).
> Is there something deep I'm missing here?

The new connection case is actually pretty much the same as all the other
cases, but maybe just easier to explain.

Suppose you do what you suggest. Have a single accept() thread which
plops FDs into a global queue. It also presumably signals a condition
variable to wake a waiting processing thread. To start processing a new
socket there are two context switches, one into the accept thread, and one
into a processing thread.
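A minimal sketch of that design (Python's threading and queue modules
standing in for a pthreads mutex/condvar; all the names here are mine,
not from any real server):

```python
import socket, threading, queue

def run_demo():
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))     # any free port
    listener.listen(5)
    port = listener.getsockname()[1]

    conns = queue.Queue()   # the global FD queue; Queue wraps a mutex+condvar

    def acceptor():
        # Context switch #1: this thread wakes when a connection arrives...
        conn, _ = listener.accept()
        conns.put(conn)     # ...and the put() signals the condvar:
                            # context switch #2, into a processing thread.

    def worker(results):
        conn = conns.get()  # blocks on the condition variable
        data = b""
        while len(data) < 5:
            chunk = conn.recv(5 - len(data))
            if not chunk:
                break
            data += chunk
        conn.close()
        results.append(data)

    results = []
    t1 = threading.Thread(target=acceptor)
    t2 = threading.Thread(target=worker, args=(results,))
    t1.start(); t2.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"hello")
    t1.join(); t2.join()
    client.close(); listener.close()
    return results[0]
```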

That second switch is a waste. Instead you could mutex protect accept()
and go into it with the processing thread, and release the mutex on the
way out. Then you have only one context switch for each new socket. This,
incidentally, is almost what happens inside the kernel... except the
kernel uses wake-all semantics (freebsd seems to have solved this for
accept... alan and linus say there are difficulties in solving it, so it
hasn't been solved in linux yet). So you can actually drop the mutex.
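A sketch of that mutex-around-accept() variant, with a small pool of
processing threads (Python again, names mine; on a kernel with wake-one
accept semantics the lock could simply be dropped, as above):

```python
import socket, threading

def run_pool(nthreads=2):
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    port = listener.getsockname()[1]

    accept_mutex = threading.Lock()
    results = []

    def processing_thread():
        # The processing thread itself goes into accept(): one context
        # switch, straight from "blocked in accept" to handling the socket.
        with accept_mutex:              # serialize entry into accept()
            conn, _ = listener.accept()
        # mutex released: another thread can enter accept() while we work
        data = conn.recv(1)
        conn.close()
        results.append(data)

    threads = [threading.Thread(target=processing_thread)
               for _ in range(nthreads)]
    for t in threads:
        t.start()
    for msg in (b"a", b"b"):
        c = socket.create_connection(("127.0.0.1", port))
        c.sendall(msg)
        c.close()
    for t in threads:
        t.join()
    listener.close()
    return sorted(results)
```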

Back to the single thread/accept queue. There's only a single thread in
accept(), and if the box you're running on has two processors you're not
exploiting the parallelism available. You could do some fun gymnastics at
the user level to put multiple threads waiting on accept() ... but that's
overkill because usually the kernel is the best judge of the parallelism
available. So just putting a collection of threads into accept() and
letting the kernel sort it out solves this.

But does it? Now you have to figure out how many threads you should have
waiting in accept at any one time. (In Apache, this is the joyful
nonsense of deciding the right MinSpareServers and MaxSpareServers
settings to handle load spikes and parallelism and all that fun stuff.)
And your threads waiting in accept are kernel-scheduled resources
consuming kernel RAM.

If all your program did was call accept() you'd be able to figure this all
out pretty easily. But presumably you do more than that.

accept() is interesting because it is actually an event queue... it's a
queue of new connections arriving on a single socket. The kernel has all
the knowledge it needs to multiplex the socket connections in a way
suitable to the hardware.

But accept() is limiting because it only handles a single listening
socket. If your web server listens on both port 80 and port 443, you need
some way to accept connections on both. You'd prefer to run a single web
server to take advantage of shared configuration memory and other
resources. Now
you need some method of accepting connections on multiple sockets. You
could just implement two pools of threads, one for each socket. But that
doesn't scale to many many sockets (which some people actually do use, for
better or for worse) ... and now you have to tune min/maxspare parameters
for multiple pools, what a headache.

What you'd really like is a way to say "accept a connection on any of
these sockets" so that you can continue to maintain a single pool of
threads. The single pool is not only easier to configure, it has the
benefits of cache locality. Presumably everything in the pool is
identical -- all the threads are capable of handling the returned socket.
The kernel can use LIFO on the waiting threads because the last-in thread
is most likely to still have data in L1.
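One way to express "accept a connection on any of these sockets" at user
level is select() over the listening FDs -- a sketch, with names of my own
choosing (and note select is still a wake-all primitive, which matters
below):

```python
import select, socket

def accept_any(listeners):
    # Block until any listening socket has a pending connection,
    # then accept from one that is ready.  Each thread in a single
    # shared pool would loop on this call.
    readable, _, _ = select.select(listeners, [], [])
    return readable[0].accept()

def demo():
    # Two listeners standing in for, say, ports 80 and 443.
    listeners = []
    for _ in range(2):
        s = socket.socket()
        s.bind(("127.0.0.1", 0))
        s.listen(5)
        listeners.append(s)
    # Connect to the *second* listener; accept_any still sees it.
    port = listeners[1].getsockname()[1]
    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = accept_any(listeners)
    ok = conn.getsockname()[1] == port   # accepted on the right socket
    conn.close(); client.close()
    for s in listeners:
        s.close()
    return ok
```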

But really the same can be said for read/write as well as accept. Suppose
you had a hybrid user/kernel threads package which uses co-operative
pre-emption, where i/o points are the pre-emption points. When a
user-thread does an i/o, the package just notes that the user-thread is
blocked. Then it
asks the kernel "give me an FD which is ready for I/O". It determines which
user-thread is waiting for that FD and dispatches that user-thread. In
this model you need as many kernel threads as there is parallelism to
exploit. The user-threads are written in the standard procedural manner,
which is easy to program (rather than the stateful manner of something
like squid where all the state transitions for i/o state are explicit
and in the control of the programmer).
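That dispatch loop can be sketched with Python generators playing the
user-threads and poll() answering the "give me an FD which is ready for
I/O" question (names are mine; a real package would switch stacks rather
than use generators):

```python
import select, socket

def user_thread(conn, results):
    # Written in the ordinary procedural style; the yield is the i/o
    # pre-emption point where the package notes "blocked on this FD".
    yield conn                       # block until conn is readable
    results.append(conn.recv(1024))
    conn.close()

def run_threads(threads):
    waiting = {}                     # fd -> blocked user-thread
    poller = select.poll()
    for t in threads:
        sock = next(t)               # run the thread until it blocks on i/o
        waiting[sock.fileno()] = t
        poller.register(sock, select.POLLIN)
    while waiting:
        # "give me an FD which is ready for I/O"
        for fd, _ in poller.poll():
            t = waiting.pop(fd)
            poller.unregister(fd)
            try:
                next(t)              # dispatch the user-thread on that FD
            except StopIteration:
                pass                 # user-thread finished

def demo():
    a, b = socket.socketpair()
    c, d = socket.socketpair()
    results = []
    b.sendall(b"one")
    d.sendall(b"two")
    run_threads([user_thread(a, results), user_thread(c, results)])
    b.close(); d.close()
    return sorted(results)
```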

Central to that is the "give me an FD which is ready for I/O" step. This
is where select/poll are traditionally used... but the question is really
a wake-one question, and select/poll are wake-all primitives. The
kernel-threads in this example are all equivalent, any one of them can
switch to any of the user-threads. Can you see how read/write are
pretty similar to accept and how all the problems are related?

Dean

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu