On Fri, 1 Sep 2000, Mike A. Harris wrote:
> I've heard comments from Alan, and others in the past bashing
> threads, and I can understand the "threads are for people who
> can't write state machines" comments I've heard, but what other
> ways are there of accomplishing the goals that threads solve in
> an acceptable manner that gives good performance without coding
the thread bashing is mostly bashing programs which do things such as a
thread per-connection. this is the most obvious, and easy way to use
threads, but it's not necessarily the best performance, and certainly
doesn't scale. (on the scalability side just ask yourself how much RAM is
consumed by stacks -- how many cache lines will that consume, and how many
TLB entries. it sucks pretty fast.)
state machines are hard. while people on linux-kernel may be hot shot
programmers who can do state machines in their sleep, this is definitely
not the case for the average programmer.
fortunately fully threaded and fully state driven are two opposite ends of
a spectrum, and there are lots of useful compromises in between where
threads are used in a way that allows the average programmer to
maintain/extend a codebase; while also getting the scalability of state
this is where "worker thread pools" and message queues come in.
for example, consider an IMAP server. the fully threaded implementation
would have a thread per connection. but most connections sit idle most of
the time. instead restructure it so that there is a thread per
in-progress command; plus one thread doing a state machine for all the
idle connections. this scales much better because you only have as many
threads as active commands. the actual IMAP code itself can be written in
the comfortable threading model rather than a state model.
it's still not perfect -- large mail messages sent to slow clients chew a
thread for an inordinate amount of time. an obvious next modification
would be to let the state thread also handle the sending of messages. this
can be done relatively cleanly.
part of the apache-2.0 design was to allow for such a setup -- using a
thread for the "thinking" part of each HTTP request, and once a response
object is decided on, it is passed back to the state machine. (responses
generally fit into the categories of disk file, pipe, or memory region.)
but nobody has yet implemented beyond thread-per-connection.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to email@example.com
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Thu Sep 07 2000 - 21:00:12 EST