Re: Linux 2.6.29-rc6

From: Ingo Molnar
Date: Fri Feb 27 2009 - 02:33:49 EST



* john stultz <johnstul@xxxxxxxxxx> wrote:

> On Thu, 2009-02-26 at 14:40 -0800, Linus Torvalds wrote:
> >
> > On Thu, 26 Feb 2009, john stultz wrote:
> > >
> > > I'll kick up some of my own testing between these two releases to see if
> > > I can't find something similar.
> >
> > Since the PIT timer read is possibly hw-dependent, it might be that you
> > can't necessarily reproduce it on some random hardware.
> >
> > How sensitive is ntpd to (stable) drift? IOW, if we get the calibration
> > wrong, the TSC should still hopefully be very _stable_, it's just that the
> > initial guesstimate for the frequency is off and ntp would have to correct
> > for that.
>
> NTP can adjust the clock about +/-500ppm (so a 1000ppm range).
> Past that it starts throwing errors.

Well, it will start throwing errors but still it will correct
the clock and find the frequency delta between the host clock
and the reference clock just fine, and converge in a couple of
hours, correct?

500ppm is a frequency drift of 0.05%, which is awfully small -
thermal effects alone can cause such differences so it should
not be anything out of the ordinary for ntpd.
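
To put the 500ppm figure in perspective, here is a minimal sketch of
the unit conversion. The function names are made up for illustration;
only the 500ppm value comes from the discussion above.

```python
def ppm_to_percent(ppm):
    """Convert parts per million to percent (1% = 10,000 ppm)."""
    return ppm / 10_000


def drift_seconds_per_day(ppm):
    """Seconds gained or lost per day by a clock whose rate is off by `ppm`."""
    return ppm * 86_400 / 1_000_000


if __name__ == "__main__":
    print(ppm_to_percent(500))         # percent
    print(drift_seconds_per_day(500))  # seconds per day
```

So a clock sitting at the 500ppm limit is off by 0.05% and drifts by
about 43 seconds per day if left uncorrected.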

> Part of the issue is that if the drift value changes in
> between boots, NTPd can take a while to settle down on the
> right freq. I suspect that's whats happening here, and should
> the box be left alone for a few hours (maybe overnight) NTPd
> will find the new drift correction the issue will go away.

If the default poll interval of 64 seconds is used then it can
take that much time - so i'd suggest decreasing that to below 10
seconds.

It's not like the frequency is changing rapidly here. The
correction pattern to find is a very simple, static and
reliable multiplier of ~1.000800 between the two frequencies.
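
A factor of ~1.000800 corresponds to a constant rate error of about
800 ppm. As a rough illustration of how a static offset like that can
be recovered from noisy observations, the sketch below fits a straight
line to simulated clock offsets with ordinary least squares. This is
not ntpd's actual clock discipline algorithm; the Gaussian noise
model, the seed and all names here are assumptions chosen for the
example.

```python
import random


def estimate_drift_ppm(n_samples, interval_s, true_ppm=800.0,
                       noise_s=0.010, seed=1):
    """Estimate a constant frequency error from noisy offset samples.

    Simulates offsets that grow linearly at `true_ppm`, adds Gaussian
    observation noise with standard deviation `noise_s`, and returns
    the least-squares slope of offset versus time, in ppm.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    times = [i * interval_s for i in range(n_samples)]
    offsets = [true_ppm * 1e-6 * t + rng.gauss(0.0, noise_s) for t in times]

    t_mean = sum(times) / n_samples
    o_mean = sum(offsets) / n_samples
    num = sum((t - t_mean) * (o - o_mean) for t, o in zip(times, offsets))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den * 1e6  # slope is seconds per second; scale to ppm


if __name__ == "__main__":
    print(estimate_drift_ppm(1000, 1.0))
```

Under these assumptions the fitted slope lands within a few ppm of the
true 800 ppm, since a fixed multiplier shows up as a straight line
through the noise.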

Say the over-the-network reference clock ntpd follows has 10
msecs of intrinsic observation noise. For that 10 msecs of noise
to go down to the 10 ppm range [to the local but drifted time
source which has ~10 ppm precision straight away], we need
roughly 1000 samples. [simplified, fewer are enough in reality,
especially if you have some known-to-have-converged-before
cached value to start out with.]
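
One plausible reading of that estimate, which is not spelled out in
the message, is a simple ratio: 10 ppm of one second is 10 usecs, and
10 msecs of noise is 1000 times that. The sketch below encodes only
this simplified ratio; the function name and the per-second
interpretation of the target are assumptions.

```python
def samples_needed(noise_us, target_ppm):
    """Simplified sample count: observation noise over the per-second target.

    1 ppm of one second is 1 microsecond, so a target of `target_ppm`
    is treated as `target_ppm` microseconds of allowed error per second.
    """
    target_us_per_s = target_ppm
    return noise_us / target_us_per_s


if __name__ == "__main__":
    print(samples_needed(noise_us=10_000, target_ppm=10))
```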

1000 samples with 64 second intervals can take about 18 hours to
converge. 1000 samples with 1 second intervals take just over 15
minutes to converge.

We'll improve in-kernel calibration but calibration noise in the
0.05% range should be expected in some cases.

Ingo