Roman Penyaev <rpenyaev@xxxxxxx> wrote:
Hi all,
+cc Jason Baron
** Limitations
<snip>
4. No support for EPOLLEXCLUSIVE
If a device does not pass pollflags to wake_up(), there is no way
to call poll() from a context under spinlock, so special work is
scheduled to offload polling. In this specific case we can't
support exclusive wakeups, because we do not know the actual result
of the scheduled work and have to wake up every waiter.
Lacking EPOLLEXCLUSIVE support is probably a showstopper for
common applications using per-task epoll combined with
non-blocking accept4() (e.g. nginx).
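For anyone following along, the pattern I mean looks roughly like
the sketch below. It is not taken from nginx or any other project;
add_listener_exclusive() and accept_pending() are hypothetical
helper names, and it assumes Linux 4.5+ with a libc that defines
EPOLLEXCLUSIVE. Each task owns its own epoll fd and adds the same
non-blocking listen socket to it:

```c
#define _GNU_SOURCE		/* for accept4() */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: add a non-blocking listen socket to this task's
 * own epoll instance, asking the kernel for exclusive wakeups.
 * EPOLLEXCLUSIVE may only be used with EPOLL_CTL_ADD.
 */
static int add_listener_exclusive(int epfd, int listen_fd)
{
	struct epoll_event ev;

	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN | EPOLLEXCLUSIVE;
	ev.data.fd = listen_fd;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);
}

/*
 * Hypothetical helper: after epoll_wait() reports the listen socket,
 * accept up to `max` clients without blocking.  Returns the number of
 * fds stored in `out`, or -1 if nothing was accepted and a real error
 * occurred (errno is left set by accept4()).
 */
static int accept_pending(int listen_fd, int *out, int max)
{
	int n = 0;

	assert(max > 0);
	while (n < max) {
		int fd = accept4(listen_fd, NULL, NULL,
				 SOCK_NONBLOCK | SOCK_CLOEXEC);

		if (fd >= 0) {
			out[n++] = fd;
		} else if (errno == EAGAIN || errno == EWOULDBLOCK) {
			break;	/* queue empty, or another task got there first */
		} else if (errno != EINTR && errno != ECONNABORTED) {
			return n > 0 ? n : -1;
		}
	}
	return n;
}
```

Getting zero clients back from accept_pending() is normal here,
since another task may have been woken for the same connection
and won the race.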
FWIW, I'm still a weirdo who prefers a dedicated thread doing
blocking accept4 for distribution between tasks (so epoll never
sees a listen socket). But, depending on what runtime/language
I'm using, I can't always dedicate a blocking thread, so I
recently started using EPOLLEXCLUSIVE from Perl5 where I
couldn't rely on threads being available.
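A minimal sketch of the dedicated-thread approach, for comparison.
Everything named here (struct acceptor, dispatch_one(),
acceptor_thread(), round-robin distribution) is an assumption made
for illustration, not code from any real project. The listen socket
stays blocking and is never added to an epoll instance; only the
accepted clients are:

```c
#define _GNU_SOURCE		/* for accept4() */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical state owned by the acceptor thread */
struct acceptor {
	int listen_fd;		/* blocking; epoll never sees it */
	const int *worker_epfds;	/* one epoll fd per worker task */
	unsigned nr_workers;
	unsigned next;		/* round-robin cursor */
};

/*
 * Block in accept4() for one client, then register it with the next
 * worker's epoll instance.  Returns the client fd, or -1 on error.
 */
static int dispatch_one(struct acceptor *a)
{
	struct epoll_event ev;
	int fd;

	assert(a->nr_workers > 0);
	do {
		fd = accept4(a->listen_fd, NULL, NULL,
			     SOCK_NONBLOCK | SOCK_CLOEXEC);
	} while (fd < 0 && (errno == EINTR || errno == ECONNABORTED));
	if (fd < 0)
		return -1;

	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.fd = fd;
	if (epoll_ctl(a->worker_epfds[a->next], EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(fd);
		return -1;
	}
	a->next = (a->next + 1) % a->nr_workers;
	return fd;
}

/* Thread entry point (signature matches pthread_create's start routine) */
void *acceptor_thread(void *arg)
{
	struct acceptor *a = arg;

	while (dispatch_one(a) >= 0)
		;	/* keep handing clients to workers until an error */
	return NULL;
}
```

Since only one thread ever sleeps on the listen socket, there are
no competing waiters to wake in the first place. The cost is the
extra thread, which is exactly what I can't always afford.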
If I could dedicate time to improving epoll, I'd probably
add writev() support for batching epoll_ctl modifications
to reduce syscall traffic, or pick up the kevent()-like interface
started long ago:
https://lore.kernel.org/lkml/1393206162-18151-1-git-send-email-n1ght.4nd.d4y@xxxxxxxxx/
(but I'm not sure I want to increase the size of the syscall table).