Re: [PATCH] async poll for 2.5

From: Dan Kegel (dank@kegel.com)
Date: Tue Oct 15 2002 - 16:09:34 EST


John Gardiner Myers wrote:
>
> Benjamin LaHaise wrote:
>
> >If you look at how /dev/epoll does it, the collapsing of readiness
> >events is very elegant: a given fd is only allowed to report a change
> >in its state once per run through the event loop.
> >
> And the way /dev/epoll does it has a key flaw: it only works with single
> threaded callers. If you have multiple threads simultaneously trying to
> get events, then race conditions abound.

Delaying the "get next batch of readiness events" call as long as
possible increases the amount of event collapsing possible, which is
important because the network stack seems to generate lots of spurious
events. Thus I suspect you don't want multiple threads all calling the
"get next batch of events" entry point frequently. The most effective
way to use something like /dev/epoll in a multithreaded program might
be to have one thread call "get next batch of events", then divvy up
the events across multiple threads. Thus I disagree that the way
/dev/epoll does it is flawed.
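
To make the shape of that concrete, here is a minimal sketch in Python
(chosen only for brevity). It does not talk to /dev/epoll at all:
ReadinessQueue is a made-up, user-space stand-in for the kernel's
per-fd readiness queue, and dispatch() and its n_workers parameter are
hypothetical names. It only illustrates the two ideas above: repeated
changes on one fd collapse until the next fetch, and a single thread
fetches the batch before handing it to workers.

```python
import queue
import threading


class ReadinessQueue:
    """Toy stand-in for a kernel readiness queue (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}  # fd -> None; a dict keeps first-seen order

    def mark_ready(self, fd):
        # A further change on an fd that is already pending collapses
        # into the existing entry instead of queuing a second event.
        with self._lock:
            self._pending.setdefault(fd, None)

    def next_batch(self):
        # "Get next batch of readiness events": drain everything pending.
        with self._lock:
            batch = list(self._pending)
            self._pending.clear()
        return batch


def dispatch(batch, handler, n_workers=4):
    """Fan one batch out to worker threads; return once all are handled."""
    work = queue.Queue()

    def worker():
        while True:
            fd = work.get()
            if fd is None:  # sentinel: no more work for this worker
                return
            handler(fd)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for fd in batch:
        work.put(fd)
    for _ in threads:
        work.put(None)
    for t in threads:
        t.join()
```

Only the caller of next_batch() touches the event source, so the longer
it waits between calls, the more changes collapse into each batch.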

> I certainly hope /dev/epoll itself doesn't get accepted into the kernel,
> the interface is error prone. Registering interest in a condition when
> the condition is already true should immediately generate an event, the
> epoll interface did not do that last time I saw it discussed. This
> deficiency in the interface requires callers to include more complex
> workaround code and is likely to result in subtle, hard to diagnose bugs.

With queued readiness notification schemes like SIGIO and /dev/epoll,
it's safest to allow readiness notifications from the kernel to be
wrong sometimes; this happens at least in the case of accept readiness,
and possibly other places. Once you allow that, it's easy to handle the
condition you're worried about by generating a spurious readiness
indication when registering a fd. That's what I do in my wrapper
library.
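
A minimal sketch of that trick, assuming a wrapper of roughly this
shape (this is not the actual wrapper library; ReadinessWrapper,
run_once() and drain() are hypothetical names, and the kernel event
source is left out entirely). Registration queues a made-up readiness
event, and the handler reads until the socket would block, so a wrong
notification costs one failed read and nothing more.

```python
import socket


class ReadinessWrapper:
    """Hypothetical wrapper over an edge-style readiness notifier."""

    def __init__(self):
        self._handlers = {}   # fileno -> (sock, callback)
        self._synthetic = []  # filenos owed a made-up readiness event

    def register(self, sock, callback):
        sock.setblocking(False)
        self._handlers[sock.fileno()] = (sock, callback)
        # The fd may already be readable, and a notifier that reports
        # only changes would never say so. Queue a possibly-wrong event.
        self._synthetic.append(sock.fileno())

    def run_once(self):
        # A real wrapper would also merge in events fetched from the
        # kernel here; this sketch delivers only the synthetic ones.
        events, self._synthetic = self._synthetic, []
        for fileno in events:
            sock, callback = self._handlers[fileno]
            callback(sock)


def drain(sock, sink):
    """Read until the socket would block; a false alarm reads nothing."""
    while True:
        try:
            chunk = sock.recv(4096)
        except BlockingIOError:
            return  # spurious event or fully drained: harmless either way
        if not chunk:
            return  # peer closed the connection
        sink.append(chunk)
```

Data that arrived before register() is picked up by the synthetic
event, and an fd with nothing to read just falls through drain().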

Also, because /dev/epoll and friends are single-shot notifications of
*changes* in readiness, there is little reason to register interest in
this or that event, and change that interest over time; instead, apps
should simply register interest in any event they might ever be
interested in. The number of extra events they then have to ignore is
very small, since if you take no action on a 'read ready' event, no
more of those events will occur.
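
As a rough illustration only: the sketch below uses the edge-triggered
mode of the epoll interface exposed by Python's select module on
present-day Linux, which is an assumption standing in for the
/dev/epoll patch under discussion and may differ from it in detail.
The function name register_everything_once() is made up. The fd is
registered once for both read and write readiness and the interest set
is never changed afterwards.

```python
import select
import socket

# Register once for everything we might ever care about.
ALL_EVENTS = select.EPOLLIN | select.EPOLLOUT | select.EPOLLET


def register_everything_once():
    """Return the event lists seen by three successive non-blocking polls."""
    a, b = socket.socketpair()
    a.setblocking(False)
    ep = select.epoll()
    try:
        ep.register(a.fileno(), ALL_EVENTS)
        first = ep.poll(0)       # whatever is reported right after registering
        b.sendall(b"x")          # state change: 'a' becomes readable
        after_data = ep.poll(0)  # the change is reported once
        ignored = ep.poll(0)     # we took no action, so nothing new to report
        return first, after_data, ignored
    finally:
        ep.close()
        a.close()
        b.close()
```

The third poll is the point: the unread byte is still sitting there,
but since nothing changed, the app is not bothered again.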

So I pretty much disagree all around :-) but I do understand where
you're coming from. I used to feel similarly until I figured out the
'right' way to use one-shot readiness notification systems
(sometime last week :-)

- Dan


