Re: epoll design problems with common fork/exec patterns

From: Davide Libenzi
Date: Thu Feb 28 2008 - 14:23:36 EST


On Thu, 28 Feb 2008, Michael Kerrisk wrote:

> But it is an ugly inconsistency. On the one hand, a child process
> cannot add the duplicate file descriptor to the epoll set. (In every
> other case that I can think of , descriptors duplicated by fork have
> similar semantics to descriptors duplicated by dup() and friends.) On
> the other hand, the very fact that the child has a duplicate of the
> descriptor means that even if the parent closes its descriptor, then
> epoll_wait() in the parent will continue to receive notifications for
> that descriptor because of the duplicated descriptor in the child.

Have you ever tried to think what it means for different *processes*
sharing a single epoll fd and doing epoll_wait() over it?
Most common case is a single event fetch thread plus dispatch. Going to
epoll_wait() over a single epoll fd from many *threads* is very much
possible, but requires care (news at 11, system software development
requires care too).
Sharing a single epoll fd (by the means of any process sharing it doing
add/wait) from different *processes* makes almost no sense at all.
"a child process cannot add the duplicate file descriptor to the epoll
set" ... how do you expect the parent (that doesn't even have the new fd
mapped) to react to such events?
If the next question is "But then why we made the epoll fd inheritable?",
the answer is, because it makes sense in many cases for a parent to hand
over an fd set to a child.



> The choice of [file *, fd] as the key for epoll sets really does seem
> unfortunate. Keying on [pid, fd] would have given saner semantics, it
> seems to me. Obviously it can't be changed now though.

I think we already went over this, and I think I clearly explained you the
reasons of not hooking into sys_close.



- Davide


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/