Re: epoll design problems with common fork/exec patterns

From: Eric Dumazet
Date: Sat Oct 27 2007 - 05:22:52 EST


Marc Lehmann a écrit :
On Sat, Oct 27, 2007 at 10:23:17AM +0200, Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:
In this case, the parent process works fine until the child closes fds,
after which the fds become unarmed in the parent too. This works as
I have no idea what exact problem you have.

Well, I explained it rather succinctly, I think. If you tell me whats unclear
I can explain...

But if the child closes some file descriptor that were 'cloned' at fork() time, this only decrements a refcount, and definitely should not close it for the 'parent'.

It doesn't. It removes it from the epoll set, though, so the parent will not
receive events for that fd anymore.

I have some apps that are happily using epoll() and fork()/exec() and have

The problem I described is fork/close/exec. close being the explicit
syscall.

no problem at all. I usually use O_CLOEXEC so that all close() are done at exec() time without having to do it in a loop. epoll continues to work as expected in the parent process.

This is because epoll doesn't behave like documented: It removes the fd
from the parents epoll set only on an explicit close() syscall, not on an
implicit close from exec.

fd sets. This would explain the behaviour above. Unfortunately (or
fortunately?) this is not what happens: when the fds are being closed by
exec or exit, the fds do not get removed from the epoll set.
at exec() (granted CLOEXEC is asserted) or exit() time, only the refcount of each file is decremented. Only if their refcount becomes NULL, files are then removed from epoll set.

Yes. But thats obviously not the only way to close fds.

Is epoll really designed to be so incompatible with the most commno fork
patterns? Shouldn't epoll do refcounting, as is commonly done under
Unix? As the fd space is not shared between rpocesses, why does epoll
try? Shouldn't the epoll information be copied just like the fd table
itself, memory, and other resources?
Too many questions here, showing lack of understanding.

You already said you don't the problem. No need to get insulting :(

epoll definitly is not useless. It is used on major and critical apps.
You certainly missed something.

Well, it behaves like documented, which is the problem. You admit you
don't understand the problem or the documentation, so again, no need to
insult me.

Hum... I will update my english vocabulary and mark "missed" as an insult.

I have no problem with epoll nor its documentation.


Please provide some code to illustrate one exact problem you have.

// assume there is an open epoll set that listens for events on fd 5
if (fork () = 0)
{
close (5);
// fd 5 is now removed from the epoll set of the parent.
_exit (0);
}


It doesnt on every kernels I had played with. And I played with *lot* of kernels you know.

If such a bug exists on your kernel, please fill a complete bug report, giving details.

Thank you

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/