Re: epoll behaviour after running out of descriptors

From: Davide Libenzi
Date: Sun Nov 02 2008 - 17:49:57 EST


On Sun, 2 Nov 2008, Olaf van der Spek wrote:

> On Sun, Nov 2, 2008 at 10:17 PM, Davide Libenzi <davidel@xxxxxxxxxxxxxxx> wrote:
> >> Wouldn't the port space require about 20+ k connects? This issue
> >> happens after 1 k.
> >
> > The reason for "When accept returns EMFILE, I call epoll_wait and accept
> > and it returns with another EMFILE." is because your sockets-close logic
> > is broken.
>
> It's not broken, it's designed that way. It's designed to hit the
> descriptor limit and then close all sockets some time after.
>
> > You get an event for the listening fd, you go call accept(2)
> > and in one or two passes you fill up the avail fd space, then you go back
> > calling epoll_wait(), and yet back to accept(2). This w/out triggering the
> > file-close-relief code (yes, you fill up 1K fds *before* 30 seconds). Of
> > course you get another EMFILE.
>
> The second EMFILE doesn't make sense, epoll_wait shouldn't signal the
> socket as ready again, right?

At the time of the first EMFILE, you've filled up the fd space, but not
the kernel listen backlog. Additions to the backlog trigger new events,
which you see after the first EMFILE. At some point the backlog is
full, so no new half connections are dropped in there, and no new events
are generated.
Again, sleeping on (EMFILE && ET) is bad mojo, and nowhere is it written that
events should be generated on EMFILE->no-EMFILE transitions.



- Davide

