Re: [PATCH] tsc_khz= boot option to avoid TSC calibration variance

From: john stultz
Date: Tue May 12 2009 - 19:03:00 EST


On Wed, 2009-05-13 at 00:42 +0200, Thomas Gleixner wrote:
> On Tue, 12 May 2009, john stultz wrote:
> > > stored or accumulated before. For the TSC calibration one could build the average
> > > of previous calibration values (if the value jumps between different numbers).
> > > That would require a sysctl (or equivalent) interface however.
> >
> > I still feel that a sysctl or /sys/ interface for this sort of thing is
> > overkill. It creates yet another interface we have to manage, and really
> > doesn't improve the situation more than the boot option would.
> > Additionally it would require quite a bit of work at the clocksource
> > level to allow for re-calibration (currently we avoid this by
> > disqualifying the TSC if it changes freq).
> >
> > That said, I've gotten very few positive comments from my patch, so I'm
> > going to give it one more spin (to address Serge's point) and if folks
> > are still feeling blah about it I'll stop pushing it.
>
> I'm fine with the command line option, but I refuse to add some sys/
> thingy which makes us add extra calibration stuff.
>
> Honestly all this is just the futile attempt to fix the flaws of NTP
> via (super)user interaction.
>
> Darn, it can not be that hard to adjust the math to do what you think
> it should do. I'm not an expert on that NTP stuff, but blindly
> stuffing the last known value into the kernel and expect that the
> calibration value did not change is nonsense.

Well, it's only nonsense when using the TSC, given that every other bit
of hardware either sticks to a fixed freq or provides its frequency to
the OS.

And the fact that boot-to-boot the TSC calibration code provides such
varying results illustrates how poor our calibration is, or how poor the
TSC's interface is for not providing its freq.
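
To put a number on that variance, here is a minimal sketch (for
illustration only; tsc_khz_delta_ppm() is a hypothetical helper, not
something in the kernel) that expresses the difference between two
calibration results in ppm, the same unit NTP uses for its frequency
correction:

```c
#include <stdint.h>

/*
 * Hypothetical helper: signed difference between two TSC calibration
 * results (both in kHz), expressed in parts per million of the old value.
 * Integer division truncates toward zero.
 */
static int64_t tsc_khz_delta_ppm(uint64_t old_khz, uint64_t new_khz)
{
	int64_t diff;

	if (old_khz == 0)
		return 0;	/* no reference value to compare against */

	diff = (int64_t)new_khz - (int64_t)old_khz;
	return diff * 1000000 / (int64_t)old_khz;
}
```

For a nominal 2 GHz part, a calibration that moves from 2000000 kHz to
2000400 kHz between boots works out to a 200ppm shift.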

That said, the TSC is fast and fine grained, so it's hard to tell folks
not to use it. It's hard enough to convince people to avoid it when it's
halting and changing frequency, and really can't be used for
timekeeping.

> You know the calibration value which created the last known parameters
> and you want an extra interface to inject this last known calibration
> value into the kernel instead of doing the math of adjusting the NTP
> parameters according to the change of calibration values ?

Again, I'd really not suggest NTP try to handle TSC calibration error
compensation itself, either through the kernel or internally.

A) It would have to add linux-x86 specific kludge code
B) It would have to know to look at which clocksource was being used and
decide what to do from there (since ACPI_PM, HPET, PIT and even jiffies
don't need this extra tweaking).

Instead I propose:
1) Improving the calibration code as best we can, given the time
constraints at boot-up.

A neat idea from Miroslav Lichvar: Round the TSC calibration results to
the nearest 100ppm. Although I suspect we already see variations beyond
100ppm, so some tweaking would probably be necessary. I'll be looking at
this option soon.
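
A rough sketch of what that rounding could look like, under my own
assumptions: round_tsc_khz() is a hypothetical name, and the grid
spacing is derived from the measured value itself, so two measurements
that land on different sides of a spacing change will not share exactly
the same grid. Real code would likely want a more stable grid.

```c
#include <stdint.h>

/*
 * Hypothetical helper: snap a calibrated TSC frequency (in kHz) to the
 * nearest multiple of a step that is `ppm` parts per million of the
 * measured value. If the step rounds down to zero the value is
 * returned unchanged.
 */
static uint64_t round_tsc_khz(uint64_t khz, uint64_t ppm)
{
	uint64_t step = khz * ppm / 1000000;

	if (step == 0)
		return khz;

	/* Add half a step so the division rounds to nearest. */
	return (khz + step / 2) / step * step;
}
```

With ppm = 100 and a 2 GHz TSC the step is 200 kHz, so a measured
2000090 kHz snaps to 2000000 kHz while 2000123 kHz snaps to 2000200 kHz.
That is exactly the sort of boundary case where tweaking would be
needed.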

2) Provide a workaround (that doesn't create some sort of userland API
we have to manage forever) for users who really care about the
super-fine details of TSC calibration and its interactions with NTP.

This patch only provides #2 above.
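
For anyone unfamiliar with the idea, the following is a userspace
illustration of what a tsc_khz= option amounts to: pull a known-good
frequency out of the command line and use it instead of calibrating.
This is not the patch itself; parse_tsc_khz() is a hypothetical
function, the value is assumed to be in kHz as the option name
suggests, and the kernel's own boot-parameter handling is not shown.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustration only: look for "tsc_khz=<n>" in a command-line string.
 * Returns 1 and stores the value in *khz if a non-zero decimal number
 * is found at a word boundary, otherwise returns 0.
 */
static int parse_tsc_khz(const char *cmdline, unsigned long *khz)
{
	static const char key[] = "tsc_khz=";
	const size_t keylen = sizeof(key) - 1;
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		/* Only accept the key at the start of a word. */
		if (p == cmdline || p[-1] == ' ') {
			const char *num = p + keylen;
			char *end;
			unsigned long val;

			errno = 0;
			val = strtoul(num, &end, 10);
			if (errno == 0 && end != num && val != 0 &&
			    (*end == '\0' || *end == ' ')) {
				*khz = val;
				return 1;
			}
		}
		p += keylen;
	}
	return 0;
}
```

A command line such as "root=/dev/sda1 tsc_khz=2000123 quiet" would
yield 2000123, and a command line without the option leaves the normal
calibration path in charge.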

thanks
-john

