Re: [PATCH] correct inconsistent ntp interval/tick_length usage
From: Roman Zippel
Date: Sun Feb 10 2008 - 13:52:18 EST
Hi,
On Fri, 8 Feb 2008, john stultz wrote:
> -ENOPATCH
>
> We're taking weeks to critique fairly small bug fix. I'm sure we both
> have better things to do then continue to misunderstand each other. I'll
> do the testing and will happily ack it if it resolves the issue.
I don't want to just send a patch, I want you to understand why your
approach is wrong.
> Now, If you're disputing that I'm correcting the wrong side of the
> equation, then we're getting somewhere. But its still not clear to me
> how you're suggesting the other side (which is calculated in
> ntp_update_frequency) be changed.
> [..]
> You keep on bringing up NO_HZ, and again, the bug I'm trying to fix
> occurs with or without NO_HZ. The fix I proposed resolves the issue with
> or without NO_HZ.
The correction is incorrect for NO_HZ.
Let's try it the other way around, as my explanation seem to lack
something.
Please try to explain what this correction actually means and why it
should be correct for NO_HZ as well.
> > The only other alternative would be to calculate this correction
> > dynamically. For this you leave NTP_INTERVAL_LENGTH as is and when
> > changing clocks you check whether "abs(((cs->xtime_interval *
> > NTP_INTERVAL_FREQ) >> cs->shift) - NSEC_PER_SEC)" exceeds a certain limit
> > (e.g. 200usec) and in this case you print a warning message, that the
> > clock has large base drift value and is a bad ntp source and apply a
> > correction value. This way the correction only hits the very few system
> > which might need it and it would be the prefered solution, but it also
> > requires a few more changes.
>
> Uh, that seems to be just checking if the xtime_interval is off base, or
> if the ntp correction has gone too far. I just don't see how this
> connects to the issue at hand.
Above is the key to understanding the problem, if this difference is small
enough there is no need to correct anything.
This is the original patch which introduced the correction:
http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commitdiff;h=69c1cd9218e4cf3016b0f435d6ef3dffb5a53860
Keep in mind that at that time PIT was used as the base clock (even if the
tsc was used, it was relative to PIT). So how much of those assumptions
are still valid today (especially with NO_HZ enabled)?
bye, Roman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/