RE: Signal driven IO

David Schwartz (davids@webmaster.com)
Wed, 17 Nov 1999 14:43:16 -0800


> Not at all. For one thing, you can have the same thread that does the
> 'poll' do all the actual I/O. This works well with send queues and
> receive queues. Since non-blocking network I/O is so fast, it's almost
> impossible for this I/O thread to become a bottleneck.
>
> It depends on the application. Consider an IMAP server, with
> potentially 10,000+ open connections, and each connection is going to
> last for the entire mail-reading session, and will be idle most of the
> time except when the user makes a request that requires going to the
> IMAP server. This is a very different access pattern than you would see
> on HTTP.

Whatever your access pattern, you have to design an implementation that's
consistent with that access pattern. For IMAP, I'd have an input thread and
an output thread. The output thread would 'poll' on only those fd's that I
have pending output for.

> If the IMAP server uses poll() to determine when the client has sent a
> new IMAP request, you don't want one thread handling every single IMAP
> command, since IMAP requests can take a long time to service.

You don't service them. You simply call 'read' to get the data and put it
in a receive queue. Then you queue a job to process the received data. Of
course, this 'handoff' has a cost associated with it. Oddly, even on NT
where you can avoid the handoff, I still do the handoff just because it's
easier; I prefer to segregate my I/O threads from my server threads
(I feel the cost is generally worth the benefit, but others in pursuit of
pure speed may not agree).

For IMAP (using poll), I would segregate the read and write portions
because they are so different. I'd have one thread in a big poll loop for the
read side. If it found an active connection, it would read the data, put it
in a receive queue, and queue a server thread to do the actual work.
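
To make the shape of that read loop concrete, here's a rough sketch. The
recv_queue_append() and job_queue_push_fd() names are made-up placeholders
for the queueing, not real APIs, and error/EOF handling is omitted:

#include <poll.h>
#include <unistd.h>

/* Hypothetical helpers -- stand-ins for the real receive queue and
 * job queue, not actual APIs: */
void recv_queue_append(int fd, const char *buf, size_t len);
void job_queue_push_fd(int fd);

void read_loop(struct pollfd *pfds, int nfds)
{
    char buf[4096];
    int i, hits;
    ssize_t r;

    for (;;) {
        hits = poll(pfds, nfds, -1);
        if (hits <= 0)
            continue;

        for (i = 0; i < nfds && hits > 0; i++) {
            if (!(pfds[i].revents & POLLIN))
                continue;
            hits--;

            /* non-blocking read: take whatever is there right now */
            r = read(pfds[i].fd, buf, sizeof buf);
            if (r > 0) {
                /* stash the bytes on that connection's receive queue... */
                recv_queue_append(pfds[i].fd, buf, (size_t)r);
                /* ...and queue a server thread to do the actual work */
                job_queue_push_fd(pfds[i].fd);
            }
        }
    }
}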

The server thread would do the work, and put the results on a 'send queue'.
It would then add the fd to the write poll list. The write thread would
likewise be in a poll loop. When an fd was a write hit, it would write as
much data from that connection's send queue as it could.

Presumably, the write poll thread would experience more wakeups. But it
would be polling on fewer fds.
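
The write side would look roughly like this. Again the send-queue helpers
are made-up placeholders, and the locking that lets server threads add fds
to pfds[] (and lets this loop drop them once drained) is not shown:

#include <poll.h>
#include <unistd.h>

/* Hypothetical per-connection output buffering, not a real API: */
int  send_queue_peek(int fd, const char **data, size_t *len); /* 0 if empty */
void send_queue_consume(int fd, size_t n);

void write_loop(struct pollfd *pfds, int nfds)
{
    int i, hits;
    const char *data;
    size_t len;
    ssize_t w;

    for (;;) {
        /* only fds with pending output are in pfds[] at all */
        hits = poll(pfds, nfds, -1);
        if (hits <= 0)
            continue;

        for (i = 0; i < nfds && hits > 0; i++) {
            if (!(pfds[i].revents & POLLOUT))
                continue;
            hits--;

            /* write as much of this connection's send queue as the
             * socket will take without blocking */
            while (send_queue_peek(pfds[i].fd, &data, &len)) {
                w = write(pfds[i].fd, data, len);
                if (w <= 0)
                    break;  /* EAGAIN or error: retry on the next hit */
                send_queue_consume(pfds[i].fd, (size_t)w);
            }
            /* once the queue is empty, the fd would be dropped from
             * pfds[] before the next poll pass (not shown) */
        }
    }
}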

Now, since this is a case where it should be impossible for network I/O to
be the limiting factor, this should be just as scalable as an events-based
implementation. However, I would expect an events-based implementation to be
better overall, since its CPU usage is lower at 'low network load'.

> It's not
> just non-blocking network I/O. With RT SIGIO, different threads will
> automatically get dispatched for each new connection that has a new IMAP
> request on it. With poll() you will need to manually dispatch the
> request to a new thread.

Yes, but the cost of this dispatch drops as load increases. This is
because one I/O thread returning from poll can queue any number of I/O jobs
in one 'pass'. And one 'worker' thread can dequeue and service any number of
jobs without a context switch. On two processors, it's even cheaper, since an
I/O thread can keep queuing jobs on one CPU while a 'worker' thread keeps
doing the jobs on the other.
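
A rough sketch of that amortization, with a made-up job structure and a
hypothetical process_job() standing in for the actual request handling:

#include <pthread.h>

struct job {
    struct job *next;
    int fd;                         /* connection this request came in on */
};

static struct job *job_head;        /* simple linked job list */
static pthread_mutex_t job_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  job_cond = PTHREAD_COND_INITIALIZER;

void process_job(struct job *j);    /* hypothetical: services one request */

/* I/O thread side: one return from poll can call this many times
 * before any worker even wakes up, so the wakeup cost is shared. */
void job_queue_push(struct job *j)
{
    pthread_mutex_lock(&job_lock);
    j->next = job_head;
    job_head = j;
    pthread_mutex_unlock(&job_lock);
    pthread_cond_signal(&job_cond);
}

/* Worker side: take everything queued in one pass and service it all
 * without another lock acquisition or context switch. */
void *worker(void *arg)
{
    struct job *batch, *j;

    (void)arg;
    for (;;) {
        pthread_mutex_lock(&job_lock);
        while (job_head == NULL)
            pthread_cond_wait(&job_cond, &job_lock);
        batch = job_head;
        job_head = NULL;
        pthread_mutex_unlock(&job_lock);

        while (batch) {             /* drain the whole batch */
            j = batch;
            batch = batch->next;
            process_job(j);
        }
    }
    return NULL;
}

(The list comes back in reverse order here, which doesn't matter for the
sketch; a real server would keep a tail pointer.)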

DS
