Re: Problem with kernel-pll in 2.0.3x (at least)

Ulrich Windl (ulrich.windl@rz.uni-regensburg.de)
Mon, 4 May 1998 09:27:53 +0200


Last week there was some discussion about time keeping with 1024 Hz,
as used in the DEC Alpha. There seem to be some problems which I was
unaware of.

-----------
> From: Jon Peatfield <J.S.Peatfield@damtp.cam.ac.uk>
> Message-Id: <E0yV9Nc-0001SU-00@kro.amtp.cam.ac.uk>

> > I have no alpha, but isn't the tick changed once per second, or
> > something like that (no source here...)?

> Well looking through various files in linux/kernel/ I don't really understand
> all that is going on, sched.c for example has code #ifdef'd on HZ being 100 to
> perform some clever fix.

> However, taking a trivial belief in the workings (that the clock is updated by
> tick uS every timer interrupt (i.e. HZ times per sec)) I estimate the
> following:

> tick = 1000000 / 1024 = 976.56250
> -> tick is actually set to 977, so in 1 second time is updated by
> 977 * 1024 uS which is 1000448 which is 448ppm too fast.

[...]
----------
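
The estimate quoted above can be checked with a few lines of C. This is only a
sketch of the arithmetic, not kernel code; the helper names rounded_tick_us()
and drift_ppm() are made up for illustration, and the rounding follows the
sched.c definition of tick quoted further down.

```c
/* Hypothetical helpers (not from the kernel) that redo the quoted estimate. */

/* Tick length in microseconds, rounded to nearest, following
 * "long tick = (1000000 + HZ/2) / HZ". */
long rounded_tick_us(long hz)
{
    return (1000000 + hz / 2) / hz;
}

/* Microseconds gained (+) or lost (-) per real second when the clock
 * advances by one fixed tick on each of hz interrupts. Since a second
 * has 1000000 us, the same number is the error in ppm. */
long drift_ppm(long hz)
{
    return rounded_tick_us(hz) * hz - 1000000;
}
```

For HZ = 1024 this gives a tick of 977 and a drift of +448; for HZ = 100 the
drift is 0.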

I have found out that Linux, unlike other implementations, does not adjust tick
to get a more accurate time. I have a proposal for better timekeeping that I'll
show at the end of this message. (I won't implement it because I have no Alpha.)

[...]
> Running xntpd3-5.93 modified to use the kernel-pll with the MAXFREQ set to
> 2048 (rather than the default 200ppm) the system stabilised after a few hours
> with freq set to -438.308 ppm which is close enough for me to believe this is
> really what is happening in the kernel.

That is a workaround, but I'd really avoid it. The trick is to get rid of the
systematic error first.

[...]
> A bit more digging shows the following definitions in sched.c:
>
> long tick = (1000000 + HZ/2) / HZ; /* timer interrupt period */
> int tickadj = 500/HZ; /* microsecs */
> long time_freq = ((1000000 + HZ/2) % HZ - HZ/2) << SHIFT_USEC;
> /* frequency offset (scaled ppm) */
>
> when HZ is 100 tickadj is 5, but once HZ is over 500 this is zero (is that
> right?) The time_freq is very interesting, at HZ=100 it is 0, and at HZ=1024

That's definitely a bug! tickadj must not be zero, because then the time cannot
be adjusted at all. Fortunately there's an extended adjtimex call to modify
tickadj without even rebooting (if you have the patch... my patch).
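
The quoted definitions can be evaluated directly to see what they yield for a
given HZ. The following is a minimal sketch, not kernel code: the function
names are hypothetical, and SHIFT_USEC = 16 is an assumption about the value in
the kernel's timex.h.

```c
/* Assumption: SHIFT_USEC is 16, as in the kernel's timex.h of that era. */
#define SHIFT_USEC 16

/* "int tickadj = 500/HZ": integer division, so 0 once HZ exceeds 500. */
int tickadj_for(long hz)
{
    return (int)(500 / hz);
}

/* Unscaled part of the quoted time_freq initializer: the rounding error
 * of tick, in microseconds per second, with the sign reversed. */
long freq_offset_ppm(long hz)
{
    return (1000000 + hz / 2) % hz - hz / 2;
}

/* Scaled value. Multiplication replaces "<<" here because left-shifting
 * a negative value is undefined behaviour in ISO C. */
long time_freq_for(long hz)
{
    return freq_offset_ppm(hz) * (1L << SHIFT_USEC);
}
```

For HZ = 100 these return tickadj 5 and a frequency offset of 0. For HZ = 1024
they return tickadj 0 and an unscaled offset of -448, the mirror image of the
+448 us per second gained by the rounded tick.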

[...]
> already been running. This means that the value for drift is different when
> the kernel-pll is used vs. an external pll xntpd if 1000000 is not divisible
> by HZ!

xntpd tries to compensate for the clock error it observes. It does not matter
whether that error comes from a software bug or from environmental conditions
such as the hardware. It's quite bad if the software itself breaks timekeeping,
as it does for 1024 Hz!

[...]
(The xntp patch)
> ***************
> *** 260,265 ****
> --- 263,269 ----
> /*
> * Fetch timekeeping data and display.
> */
> + ntv.modes=0;
> status = ntp_gettime(&ntv);
> if (status < 0)
> perror("ntp_gettime() call fails");
> --cut-here--

Be careful: the standard ntp_gettime() does not use "modes", and I'm unsure
whether other implementations have it. You may break things. (As the name
indicates, ntp_gettime() should only get the time, which implies "modes == 0".)

(Philip Gladstone <philip@raptor.com> proposed a kernel change)

Before adding my code snippet (Perl, but C hackers can surely understand it),
I'd like to remind you to remove "stenn@whimsy.udel.edu" from the recipients if
your answer is unrelated to xntp.

Regards,
Ulrich

#!/usr/bin/perl -w
# kernel time simulator for 1024 interrupts per second
# (c) 1998 by Ulrich Windl
use 5.004;
use integer;
my $jiffies = 0;
my $t = 0;

# Using 1024 Hz as interrupt frequency the following is true:
# Using 976us as tick you'll have 999424us per second, and 576us error
# Using 977us as tick you'll have 1000448us per second, and 448us error
# Using 976us and 977us every second tick, you'll have 999936us per second,
# and 64us error
# Using 976us and 977us every second tick, plus 978 every 16th tick, you'll
# have no error (while your time is still quite steady)! 8-)
sub do_jiffies($)
{
    my $count = shift;
    print "processing $count timer interrupts (at 1024 Hz)...\n";
    while ($count--) {
        if ($jiffies % 2) {
            $t += 976;
        } elsif ($jiffies % 16) {
            $t += 977;
        } else {
            $t += 978;
        }
        ++$jiffies;
    }
}

printf("time = %9u, jiffies = %9u\n", $t, $jiffies);
foreach my $c (1, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096) {
do_jiffies($c);
printf("time = %9u us, jiffies = %9u interrupts\n", $t, $jiffies);
}
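
For those who prefer C, the same tick pattern might be written as below. This
is a hypothetical port of the Perl simulation, not a kernel patch; the names
tick_us_1024() and simulate_us() are invented for illustration.

```c
/* Hypothetical C port of the Perl simulation above; not kernel code. */

/* Microseconds to add for a given interrupt number at 1024 Hz:
 * odd jiffies get 976, every 16th jiffy gets 978, and the remaining
 * even jiffies get 977. */
unsigned long tick_us_1024(unsigned long jiffy)
{
    if (jiffy % 2)
        return 976;
    if (jiffy % 16)
        return 977;
    return 978;
}

/* Time in microseconds accumulated over the jiffies
 * start, start+1, ..., start+count-1. */
unsigned long simulate_us(unsigned long start, unsigned long count)
{
    unsigned long t = 0;
    unsigned long j;

    for (j = start; j < start + count; ++j)
        t += tick_us_1024(j);
    return t;
}
```

The pattern repeats every 16 interrupts, and each group of 16 adds
8*976 + 7*977 + 978 = 15625 us, so 64 groups (1024 interrupts) add exactly
1000000 us.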

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu