Re: [BUG] APM resume breakage from 2.6.18-rc1 clocksource changes

From: Thomas Gleixner
Date: Thu Jul 13 2006 - 16:22:23 EST

Next message: Eric Paris: "[PATCH] Fix security check for joint context= and fscontext= mountoptions"
Previous message: David Miller: "Re: 2.6.18-rc1 fails to boot on Ultra 5"
In reply to: john stultz: "Re: [BUG] APM resume breakage from 2.6.18-rc1 clocksource changes"
Next in thread: john stultz: "Re: [BUG] APM resume breakage from 2.6.18-rc1 clocksource changes"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

John,

On Tue, 2006-07-11 at 17:42 -0700, john stultz wrote:
> > >
> > > Time keeping code reads from a given source and does the necessary
> > > adjustments when NTP is active. There is no relation to an interrupt
> > > event at all. At the very end it boils down to a linear equation which
> > > is recalculated at the synchronization points.
> > >
> > > The timer interrupt itself is not a synchronization point.
> >
> > If the timer interrupt is not a synchronization point, what is?
>
> Synchronization point seems like a bad term in my mind. I prefer to
> think of it as an accumulation point (to avoid clocksource overflows) as
> well as a calculation point where we can make careful adjustments to the
> clocksource frequency. Please note that these two actions are not
> logically linked and could be done separately (although there really
> isn't a need to do so).

I used the term synchronization point for the points on the timeline,
where the information of an external time reference is available, e.g.
NTP data, a PPS interrupt ....

At those points we calculate the deviation from the external time
reference and refine the conversion and adjustment factors. In the
simplest form this boils down to a linear equation where we interpolate
between the points which are defined by the external time reference.

> > The timekeeping doesn't care much where synchronizations point comes from,
> > but it can do a much better job, if it knows when they come.
> > Without knowing this the timekeeping code has to do extra work and has to
> > be either optimistic or assume the worst case, neither is good for precise
> > timekeeping, the first causes extra jitter, the latter very slow
> > adjustments.
>
> This I agree with. But first lets bound the issue to make it more clear
> for everyone: The conflict lies only with the high-precision clocksource
> adjustment code and not so much with the rest of the timekeeping code.
>
> The issue being that the high-precision clocksource adjustment code is
> needed because when NTP makes an adjustment, that very precise
> adjustment might be finer then the smallest multiplier unit adjustment
> that could be made to the clocksource (1/(2^clock->shift)).
>
> To compensate for that, at interrupt time we accumulate the
> high-precision error between what NTP told us to use and what we're able
> to use and store it in the error value. Then we apply extra short-term
> correction when the error grows large enough to keep our long term
> accuracy finer then the (1/2^shift) clocksource resolution.
>
> Without this high-precision adjustment, you would have to rely on NTPd
> to detect and adjust for this granularity difference, which would cause
> a slower, but larger jitter.
>
> Now the problem is, that if interrupts are delayed, this extra short
> term correction might be applied for too long, causing the overshoot
> issue we've seen (which I believe is the "extra jitter" issue mentioned
> above).
>
> However, if we greatly dampen our adjustments, so we are less likely to
> overshoot, this means it will take longer for us to converge (ie: the
> "slow adjustment" issue above).
>
> My only problem here is this: I don't think the slow adjustment issue is
> as severe as claimed. NTPd itself limits its adjustment speed to 500ppm
> and the frequency of its adjustment changes are in the minutes range. So
> I'm not sure I see why damping the error correction so we converge a bit
> more slowly over a period of a second or two is such an issue.
>
> I think this would give a bit of independence between the clocksource
> adjustment code and the interrupt frequency (and likely improve
> robustness as well).

John, that's exactly the central point. In an environment where we have
non periodic interrupts (high resolution timers, dynamic ticks) this
adjustment mechanism which relies on the periodic precise event to do
the accumulation does not work any more. I do not like the idea of
modifying this in a way that the timekeeping code does this fine grained
adjustment on the non periodic timer events. This will be a nightmare of
math and decrease robustness a lot.

The whole concept of doing the fine grained adjustment in order to
compensate for the inaccuracy of the scaled math factors on a _periodic_
event is flawed by design. The latency of interrupts in the kernel and
the fact that the periodic interrupt might be driven by a different
hardware clock, which has a different drift behaviour than the clock
which drives the time source, are reason enough to think about a
solution which makes this interdependency go away completely. In a high
resolution timer / dyntick system and also on virtualized environments
we need to get this dependency removed anyway.

The adjustment code is simply interpolation between points and boils
down to linear equations. Due to the fact that the conversion factor is
not accurate enough we need some mechanism to compensate this. I accept
that the current design has its charm, but I'm quite sure that we can do
a equally precise calculation without the interaction with the timer
interrupt code.

I know you prefer shopping over math, but it is a solvable problem.

tglx

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Eric Paris: "[PATCH] Fix security check for joint context= and fscontext= mountoptions"
Previous message: David Miller: "Re: 2.6.18-rc1 fails to boot on Ultra 5"
In reply to: john stultz: "Re: [BUG] APM resume breakage from 2.6.18-rc1 clocksource changes"
Next in thread: john stultz: "Re: [BUG] APM resume breakage from 2.6.18-rc1 clocksource changes"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]