Re: Linux's implementation of poll() not scalable?

From: Linus Torvalds (
Date: Thu Oct 26 2000 - 11:44:21 EST

On Thu, 26 Oct 2000, Dan Kegel wrote:
> With level-triggered interfaces like poll(), your chances
> of writing a correctly functioning program are higher because you
> can throw events away (on purpose or accidentally) with no consequences;
> the next time around the loop, poll() will happily tell you the current
> state of all the fd's.


However, we also need to remember what got us to this discussion in the
first place. One of the reasons why poll() is such a piggy interface is
exactly because it tries to be "nice" to the programmer.

I'd much rather have an event interface that is documented to be edge-
triggered and is really _lightweight_, than have another interface that
starts out with some piggy features.

I do understand that level to some degree is "nicer", but everybody pretty
much agrees that apart from requireing more care, edge-triggered certainly
does everything the level-triggered interfaces do.

For example, if you want to only partially handle an event (one of the
more valid arguments I've seen, although I do not agree with it actually
being all that common or important thing to do), the edge-triggered
interface _does_ allow for that. It's fairly easy, in fact: you just
re-bind the thing.

(My suggested "bind()" interface would be to just always add a newly bound
fd to the event queue, but I would not object to a "one-time test for
active" kind of "bind()" interface either - which would basically allow
for a poll() interface - and the existing Linux internal "poll()"
implementation actually already allows for exactly this so it wouldn't
even add any complexity).

> With edge-triggered interfaces, the programmer must be much more careful
> to avoid ever dropping a single event; if he does, he's screwed.
> This gives rise to complicated code in apps to remember the current
> state of fd's in those cases where the programmer has to drop events.

No, the "re-bind()" approach works very simply, and means that the
overhead of testing whether the event is still active is not a generic
thing that _always_ has to be done, but something where the application
can basically give the kernel the information that "this time we're
leaving the event possibly half-done, please re-test now".

Advantage: performance again. The common case doesn't take the performance


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
Please read the FAQ at

This archive was generated by hypermail 2b29 : Tue Oct 31 2000 - 21:00:18 EST