Re: Timings for optimised poll(2)

Richard Gooch (rgooch@atnf.CSIRO.AU)
Mon, 25 Aug 1997 13:11:02 +1000


Michael O'Reilly writes:
> Richard Gooch <rgooch@atnf.CSIRO.AU> writes:
> >
> > There are a few reasons why you would not use the find_first_zero_bit
> > function in a real-world application:
> >
> > 1) it assumes you are using select(2) and fd_sets. This has the old
> > problem of hard-wired limits on the fd_set size. Today the limit is
> > 1024. Tomorrow somebody wants more, so you have to recompile your
> > kernel. If you want to manage a large number of descriptors, you would
> > use poll(2) instead
>
> No, you only recompile your application. The kernel get passed the
> number of bits in the set. If you've hard coded limits into the
> kernel, that's a seperate issue.

If you have a closer look at linux/fs/select.c you will see there is a
limit of 5120 descriptors possible with select(2), because the kernel
allocates a single page for its fd_set_buffer. This is not simply a
"tunable parameter" you can tweak in a header file before compiling
your kernel.
Using poll(2) does not have this problem (you only need to tweak a
header).

> > 2) every time poll(2) or select(2) returns, you may get dozens of
> > descriptors which are ready. How do you solve this? Call
> > find_first_zero_bit for each active descriptor (clearing as you go)?
>
> Why not? It's still a hell of a lot faster than the silly code you
> have below. The vast majority of the time, only a handfull of the bits
> are going to be set.

With 10 000 connections, I can well imagine 50 active connections per
call to poll(2). Is that a "handfull"?

> > I've appended my latest version test programme. Compile with -O2. This
> > contains a *realistic* scanning loop. It takes 1.5 milliseconds on a
> > Pentium 100.
>
> Congratulations. You managed to time a naive scan through a large
> array.
>
> This is one of the reasons you use select(). It means you can scan a
> bit array very fast.

Now, if you think a little more about what select(2) does, you will
note that you have to set up the fd_sets before you call it, *each and
every time*, because the fd_sets are modified by the call. So, to set
up the fd_sets, you have to do a "naive" scan through the
list. Clever.
What I do with poll(2) is to set of the pollfd array *once*, since the
call will not modify the fd and events fields. I can then call poll(2)
many times without setting up the input array.

Now take a closer look at the implementation of select(2): you will
notice that it scans the input fd_sets to construct the poll masks. So
there is yet another complete scanning of the list. So this means that
using select(2) requires at least two complete scans without clever
bit-manipulation, plus another complete scan with possible
bit-manipulation.
This means using poll(2) is *faster* than using select(2), because you
don't have to do a scan with possible bit-manipulation.

Let's go back to my original poll2(2) suggestion: the kernel passes
back the list of active descriptors. This means that *only one* scan
of the list needs to be done per call. This is even better than using
poll(2) which requires two scans per call.

Are you still convinced that select(2) is so efficient?

Regards,

Richard....