With small changes to the existing model, we can get significant
speedups. A problem I see with ioevents is that it's quite a different
model.
> select() is O(n) where n is the max descriptor. poll() is O(m) where m is
> the number of descriptors. Both are linear in their inputs ... m <= n...
> poll is generally more efficient at the kernel level since select is
> implemented in terms of poll. Even in the cases where the outputs are
> sparse and find_first_zero_bit becomes a win, you still had to pay for
> O(n) operations inside select(). So timing things at the user level only
> is totally bogus.
I sent the following analysis offline to someone else, but it's
probably of wider interest:
Let's compare the costs:
select(2):
- user-space copy of fd_sets (small copy)
- kernel-space iteration for each fd to do __get_user for each event
type: hey! that's *3* scans
- kernel-space clearing of fd_set_buffer (3 small clears)
- kernel-space iteration for each fd to make poll masks with 3 bit
manipulations
- kernel-space iteration for each fd to do __put_user for each event
type: another 3 scans
- user-space find-first/find-next scan to find who's active (fast scan)
I count: 7 full scans of the fd list
1 small copy
3 small clears
1 fast scan
poll(2):
- kernel-space copy_from_user of pollfds
- kernel-space iteration for each fd to manipulate poll mask
- kernel-space iteration for each fd to do __put_user
- user-space iteration for each fd to find who's active
I count: 3 full scans of the fd list
1 full copy
poll2(2):
- kernel-space copy_from_user of pollfds
- kernel-space iteration for each fd to manipulate poll mask
- kernel-space iteration for each ready fd to do __put_user
- user-space iteration for each ready fd to find who's active
I count: 1 full scan of the fd list
1 tiny scan/copy of ready fds
1 tiny scan of ready fds
Regards,
Richard....