Re: poll/select timeout accuracy

Kenneth Albanowski (kjahds@kjahds.com)
Tue, 21 Jul 1998 12:43:46 -0400 (EDT)

Next message: A. Wik: "Re: Caveat emptor Re: loop.c: DES bugfixes"
Previous message: peloy@ven.ra.rockwell.com: "Re: poor network performance in 2.1.10[58](factor of 8 to 15 worse)"
In reply to: Sid Boyce: "2.1.110-pre3 fails mem.o"

On Mon, 20 Jul 1998, Jean-Marie Sulmont wrote:

> Hello.
>
> I'm having problems with the timer accuracy of both the poll() and select()
> system calls on 2.1.xxx on an Intel P6 box where the CPU is clocked at
> 300MHz. I'm attaching a little program that demonstrates the problem.
> In two words, if you do:
> t1 = getclock();
> poll(0,0,10);
> t2 = getclock();
> then t2 - t1 is going to be 19.422559ms on average. So the error is 9.42ms.
> The little program computes the standard deviation and arithmetic mean.
> When the timeout given to poll is 100, the error drops to 9.18ms.
> When it is 550, it drops to 7.8ms. At 1second, the error is 6.553135ms.
>
> I'm using the real-time 64-bit counter of the P6.
> I've tried to read the sys_select() code but do not know what a jiffy
> is. The program works OK on a Sun, showing a constant and consistent error.
>
> This is *very* annoying. Does anyone have an idea? What am I doing wrong?

I can't describe exactly what is going on (having only recently studied
this myself, and not having looked at 2.1.x) but I can mention some points
(if any of this is wrong, everyone feel free to correct me):

1. Any time you enter a syscall (like select or poll), some time is
lost during the transition to the kernel, and there is an increased chance
of losing more (by immediately switching to another task).

2. In 2.0.x, the timeout used by select() is only precise to the
jiffy. I presume this holds for select() and poll() under 2.1.x. Jiffies
are a timing unit used on all CPUs, and are some multiple of the system's
oscillator period, usually winding up to be a few ms apiece. Their
length is defined by the HZ constant which (perhaps confusingly) describes
the number of jiffies per second. HZ is usually 100, with the Alpha being
the odd man out with 1024. The length of a jiffy is purely an
architectural decision, and is not affected by any common CPU speed
characteristics.
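
To make the relationship between HZ and jiffy length concrete, here is a
minimal sketch in C. The function names are made up for illustration, and
the round-up policy in the second function is an assumption, not something
taken from the kernel source.

```c
#include <assert.h>

/* Length of one jiffy in microseconds for a given HZ (jiffies per second).
 * Integer division truncates, so HZ=1024 gives 976 rather than 976.5625. */
long jiffy_length_us(long hz)
{
    assert(hz > 0);
    return 1000000L / hz;
}

/* Convert a millisecond timeout to whole jiffies, rounding up so that the
 * request is never shortened. The rounding policy is an assumption made
 * for illustration only. */
long ms_to_jiffies_round_up(long ms, long hz)
{
    assert(ms >= 0 && hz > 0);
    return (ms * hz + 999) / 1000;
}
```

With HZ = 100 a jiffy is 10000us and a 10ms request is exactly 1 jiffy;
with HZ = 1024 a jiffy is about 976us, so the same request spans 11 jiffies.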

3. All system decisions on time, both for timekeeping and for deciding
which tasks to run, and when, are made in jiffies, and, barring syscalls,
the kernel will only switch between tasks on jiffy boundaries. The only
place more precise time is provided is by the gettimeofday() syscall,
which accounts (if possible on the hardware) for the current partial
jiffy. (Or by the CPU counter, which is completely independent of all
this.)
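
For anyone without access to the CPU counter, the same measurement can be
sketched with gettimeofday(). This is only a rough illustration, not the
original test program; the function name timed_poll_us is hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/time.h>

/* Sleep in poll() with no file descriptors for timeout_ms milliseconds and
 * return the elapsed wall-clock time in microseconds as reported by
 * gettimeofday(). Returns -1 if any call fails (e.g. poll() interrupted). */
long timed_poll_us(int timeout_ms)
{
    struct timeval t1, t2;

    assert(timeout_ms >= 0);   /* a negative timeout would block forever */

    if (gettimeofday(&t1, NULL) != 0)
        return -1;
    if (poll(NULL, 0, timeout_ms) < 0)
        return -1;
    if (gettimeofday(&t2, NULL) != 0)
        return -1;

    return (t2.tv_sec - t1.tv_sec) * 1000000L + (t2.tv_usec - t1.tv_usec);
}
```

Calling timed_poll_us(10) in a loop and collecting the results gives the
same kind of mean and spread the original program reports, subject to
whatever resolution gettimeofday() has on the hardware.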

4. No traditional UNIX timing call purports to be precisely accurate
[sic], and due to the way UNIX usually works (switching away from a task
at "any time"), no traditional call _can_ be accurate.

5. It's impossible to measure periods <= 1 jiffy, since the kernel
doesn't (usually) know where it is within the jiffy. I expect it errs on
the large side, so it will (_effectively_) wait for the next jiffy to
occur, and start timing from there. I assume that with a perfect
distribution of start times, you'd expect 15ms of delay, on average, given
a request of 10ms (5ms on average waiting for the next tick, plus the 10ms
requested). However, I also assume that the distribution will not be
perfect, as "new things" will tend to happen directly after a jiffy tick.
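
The 15ms figure can be checked with a small sketch in C. It encodes only my
assumed model (wait for the next tick, then count the requested jiffies),
not actual kernel behaviour, and the function name is hypothetical.

```c
#include <assert.h>

/* Mean delay in ms when a request of request_jiffies starts at a phase
 * spread evenly across the jiffy. Assumed model: the kernel waits for the
 * next tick, then counts the requested jiffies from there. */
double mean_delay_ms(int request_jiffies, double jiffy_ms, int samples)
{
    assert(request_jiffies >= 0 && jiffy_ms > 0.0 && samples > 0);

    double total = 0.0;
    for (int i = 0; i < samples; i++) {
        /* Midpoint sampling of the phase within the current jiffy. */
        double phase = (i + 0.5) * jiffy_ms / samples;
        double wait_for_tick = jiffy_ms - phase;
        total += wait_for_tick + request_jiffies * jiffy_ms;
    }
    return total / samples;
}
```

For a 1-jiffy request with a 10ms jiffy, the mean comes out at 15ms: half a
jiffy waiting for the tick plus the jiffy requested.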

6. I expect the 9.42ms error in the 10ms test is caused by
synchronization to the beginning of the tick (since the poll() routine
will never return before a tick completes, and returns very soon after),
so you are almost always getting a delay of 1.9999 jiffies, or 19.999ms.
The final .3799ms is partially accounted for by poll() taking a little
time to clean up after itself, and periodic happenstance of task
switching. (A histogram ought to show any periodic effects.)
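
The synchronization effect can be sketched as a tiny simulation in C. Again
this is only my assumed model, with made-up names: ticks fall on multiples
of the jiffy, each wait runs to the next tick and then lasts the requested
number of jiffies, and the next call starts as soon as the previous one
returns.

```c
#include <assert.h>

/* Simulate `calls` back-to-back timed waits, in integer microseconds.
 * Assumed model (not taken from kernel source): ticks fall on multiples of
 * jiffy_us; a wait first runs to the next tick, then lasts request_jiffies
 * more ticks; returning to the caller costs overhead_us.
 * Returns the mean measured delay in microseconds (truncated). */
long mean_back_to_back_delay_us(long start_us, long jiffy_us,
                                long request_jiffies, long overhead_us,
                                int calls)
{
    assert(start_us >= 0 && jiffy_us > 0 && request_jiffies >= 0);
    assert(overhead_us >= 0 && overhead_us < jiffy_us && calls > 0);

    long now = start_us;
    long total = 0;
    for (int i = 0; i < calls; i++) {
        long begin = now;
        long next_tick = (now / jiffy_us + 1) * jiffy_us;
        now = next_tick + request_jiffies * jiffy_us + overhead_us;
        total += now - begin;
    }
    return total / calls;
}
```

In this model only the first call sees an arbitrary phase. Every later call
starts just after a tick, so it measures two full jiffies for a 1-jiffy
request, which pulls the mean toward 20ms rather than 15ms.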

7. Your 6.55ms error at 1 second (100 jiffies) should be accounted for in
the same manner (but reduced, since the synchronization effect can only
add 9.9999ms, at most, regardless of the requested delay). Longer periods
should reduce it further, presumably asymptotically approaching 5ms +
whatever overhead is involved.

8. Doing actual tests with your code on a P5 system running 2.0.31 shows
similar inaccuracies, but with an additional bit of fun: using periods of
10ms (20ms, 30ms, etc.) often shows _negative_ errors, apparently by
select() rounding down the desired jiffy count. Furthermore, I get
progressively larger errors with longer periods, implying something about
my clock or the kernel (though I'm really not sure what, at the moment.)
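
To show how a rounded-down jiffy count could produce negative errors, here
is a sketch in C comparing two hypothetical timer policies. It illustrates
my guess above, not what select() in 2.0.31 actually does, and the function
name is made up.

```c
#include <assert.h>

/* Timing error (measured minus requested), in microseconds, for a wait that
 * starts phase_us after the last tick. If extra_tick is nonzero, the timer
 * first runs to the next tick and then counts request_jiffies ticks; if it
 * is zero, the timer expires on the request_jiffies-th tick after the call.
 * Both policies are assumptions made for illustration. */
long timing_error_us(long phase_us, long jiffy_us, long request_jiffies,
                     int extra_tick)
{
    assert(jiffy_us > 0 && request_jiffies >= 1);
    assert(phase_us >= 0 && phase_us < jiffy_us);

    long ticks = request_jiffies + (extra_tick ? 1 : 0);
    long measured = ticks * jiffy_us - phase_us;
    long requested = request_jiffies * jiffy_us;
    return measured - requested;
}
```

A 10ms request that starts 4ms into a 10ms jiffy returns 4ms early without
the extra tick and 6ms late with it, so only the first policy can give the
negative errors I saw.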

9. None of this matters one whit, unfortunately. As I said, way back in
point 4, UNIX (of which Linux is one) doesn't have guaranteed timing, and
indeed any general-purpose pre-emptively multitasking desktop OS is going
to have limited timing accuracy. That select() and poll() do anything
nearly this accurate is a miracle. For more accurate timing, you need to
investigate "real-time" options such as RT-Linux, special-purpose OSs, or
at least look at a fast Alpha box (which, judging from a HZ of 1024, may
be able to reduce these inaccuracies 10-fold).

-- Kenneth Albanowski (kjahds@kjahds.com, CIS: 70705,126)

