Re: accept() improvements for rt signals

From: Julian Anastasov (uli@linux.tu-varna.acad.bg)
Date: Sun Feb 20 2000 - 15:55:42 EST


        Hello,

On Sun, 20 Feb 2000, Dan Kegel wrote:

> Julian Anastasov (uli@linux.tu-varna.acad.bg) wrote:
> > 2. accept() the connection
> > 3. fcntl(connected_socket,F_SETFL,O_NONBLOCK|O_ASYNC|O_RDWR)
> > 4. fcntl(connected_socket,F_SETSIG,sig)
> > 5. fcntl(connected_socket,F_SETOWN,again_my_pid)
> > 6. read(connected_socket...), sigwaitinfo()
> >[has possible races, so]
> > My proposal: is it possible to add SO_INHERITOWN or other
> > socket/file (F_COPYOWN) option suitable for listening sockets
> > with the main goal to let sys_accept() to copy the fd owner after
> > get_fd(2.2.x) or sock_map_fd(2.3.x) in net/socket.c.
>
> Yep. You're not the first to notice this.
> Zach Brown has been finding and working around races just
> like this while writing phhttpd. In email to Zach and me,
> sct@redhat.com proposed:
>
> > One thing we can do to deal with some of this is to have a new fcntl to
> > augment F_SETOWN, FASYNC and F_SETSIG. A F_SETFAST() would perform a
> > setsig and a setown at once, _and_ would accept a poll struct into which
> > it would fill the current state of the socket. That would provide a
> > single-syscall socket setup mechanism which guarantees not to lose any
> > events (events that happened before the SETFAST would be returned in the
> > poll struct).

        Sounds good but we have to use this system call after each
accept() which compared with the fd ownership/flags inheritance
is slower. But 1 syscall is better than 4 syscalls.

>
> Here's another possible race:
> User closes fd 5. Events for fd 5 are still in the signal queue.
> User does accept(), which returns a new fd 5.

        If we use small number of listeners (usually one) we
can flag that sigwaitinfo() returned info about activity on
the listener and to accept() the next connections after
ensuring the rt queue is empty. By this way we know that no new
fd is reused after our close() and there is no notification
in the queue for already closed sockets. OK, it is possible
the accept() to be delayed forever if the rt queue never gets
empty, i.e. on very busy server. We can't stop the kernel to
enqueue more and more events.

        I think in normal situations we don't need notifications
from accept(). These events are synchronous. Is there any reason
for these notifications? If the accepted socket is automatically
setup for rt signals we don't need notifications from accept().
But may be they are required if there are still events in the rt
queue for the closed descriptors and we must know the order of
the events.

>
> Your proposed new ioctl could eliminate both the race you're worried about
> *and* the one I just pointed out if it also delivered a signal that said
> "fd 5 created". The user would then discard any events for fd 5
> received after the close() but before the "fd 5 created" signal.

        But this problem is not only for accept(). We can create
non-socket descriptors and their events must be notified too.
This sounds very complex.

We have two variants:

1. close() to dequeue the events and to do it very fast.

2. If our program closed a fd and sigwaitinfo() returns event
for listener we have to call accept() after the rt queue is empty
or at least after the last event currently in the rt queue.

>
> That "fd 5 created" signal would tell the user code to reset its
> "fd 5 poll status" variable, which would then be updated by the sigio
> signals whenever fd 5's poll status changed. No need for an initial
> call to poll() or a poll struct * in F_SETFAST then.

        In fact, I performed only some tests. But I think we
even don't need to fallback to poll() on SIGIO. I use large
value for the rt queue size (rtsig-max). But may be the overflow can
happen on very busy host? Is there any reason rtsig-max to be
greater than num_possible_sock_events*num_of_possible_async_fd in
current list of supported events ? I.e. for sockets:
2 events (read/write) * 32768 sockets = 65536 ? If I use up to
32768 async fds I need to setup the rtsig-max=65536. If we use
sigwaitinfo() together with the proper socket calls we can't
have more enqueued events? Is this correct? I still didn't tested
how the kernel performs with long rt queue. If we never reach the
rtsig-max why should I need to wait for SIGIO?

        So, what about adding flag to let accept() to copy
the ownership (f_owner) and fd flags (file->f_flags) from
the listener to the connected socket. I don't see other races
if this is the only change to the current kernels. The other
problem is how to ensure that the events are not for already
closed descriptors. May be we can play with different signal
numbers, i.e. we have to check if the signum in the event
is the expected signum for the descriptor. If not, this is
event for old fd. We need min 2 signal numbers for this.
We have to remember which was the last signum used for
this fd and to set another one using F_SETFAST (if added in
the kernel). May be there are other better solutions.

Regards

--
Julian Anastasov <uli@linux.tu-varna.acad.bg>

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Feb 23 2000 - 21:00:25 EST