Omar Kilani <omar.kilani@xxxxxxxxx> wrote:
Hi there,
I'm still trying to piece together a reproducible test that triggers
this, but I wanted to post in case someone goes "hmmm... change X
might have done this".
Maybe Davidlohr knows, since he's responsible for most of the
epoll changes in 5.0.
Basically, something's broken (or at least, has changed enough to
cause problems in user space) in epoll since 5.0. It's still broken in
5.1-rc5.
It doesn't happen 100% of the time. It's sort of hard to pin down but
I've observed the following:
* nginx not accepting connections under load
* A Java app which uses Netty / NIO having strange writability
semantics on channels, which confuses Netty / Java enough to not
properly flush written data on the socket.
I went and tested these Linux kernels:
4.20.17
4.19.32
4.14.111
And the issue(s) do not show up there.
I'm still actively chasing this up, and will report back - I haven't
touched kernel code in 15 years so I'm a little rusty. :)