Re: large, spurious[?] TSC skews on AMD 760MPX boards

From: Monty
Date: Tue Jul 27 2004 - 17:35:17 EST

On Tue, Jul 27, 2004 at 07:26:04PM +0200, Andi Kleen wrote:
> xiphmont@xxxxxxxx (Monty) writes:
>
> > Ever since getting my first dual Athlon, the system timer was 'not
> > quite right' when running at stock speed. Selects, alarms, etc, had a
> > strange way of firing fractions of a second or several seconds 'too
> > late'. I discovered that overclocking by about 10% made the problem
>
> That points away from the TSC actually. select and alarm use the jiffies
> clock, which is managed by the PIT timer in the southbridge. AFAIK
> they never rely on the TSC.

I believe you, but the timer problem appears only when the kernel
reports a TSC skew at boot.

> Assuming it is the TSC:
>
> You could write a multithreaded program that polls the TSCs
> on both of your CPUs for a long time and check the drift rate.
> The kernel will try to fix it at boot time, but it cannot do that
> when the TSCs are drifting later.

Drift doesn't seem to be the problem; if the system boots without the
'skew' message, I have no timer difficulties even if the box is up for
months. If the system boots with a skew message, no timer-based
operation on the machine ever seems to work; I can't watch movies,
play games, or anything else. I'll get a few frames, a freeze for
several seconds, a few seconds of frames, another freeze for several
seconds, a frame or two, more freeze, etc. This appears to be related
to processor affinity (when the process gets bounced to the other CPU,
the timers appear to freeze for a while or stop entirely).
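
For anyone who wants to look at the TSCs directly, here is a minimal
sketch of a check in the spirit of what Andi suggests. It is not the
multithreaded poller he describes; as a simpler stand-in, one thread
pins itself to each CPU in turn with sched_setaffinity() and reads the
counter there. The function names (read_tsc_on_cpu, tsc_offset) are
made up for illustration, and the code assumes Linux on x86 with a
compiler that provides __rdtsc(). The migration itself costs many
cycles, so only large offsets are meaningful.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for sched_setaffinity() and CPU_SET() */
#endif
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <x86intrin.h>          /* __rdtsc(); x86 only */

/* Hypothetical helper: pin the calling thread to `cpu`, then read that
 * CPU's TSC. Returns 0 on success, -1 if the affinity call fails
 * (for example, if the CPU does not exist). */
int read_tsc_on_cpu(int cpu, uint64_t *tsc)
{
    cpu_set_t set;

    assert(tsc != NULL);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;
    *tsc = __rdtsc();
    return 0;
}

/* Hypothetical helper: hop cpu_a -> cpu_b -> cpu_a and estimate how far
 * cpu_b's TSC is from the midpoint of the two cpu_a readings.
 * The estimate is only as good as the migration latency allows. */
int tsc_offset(int cpu_a, int cpu_b, int64_t *offset)
{
    uint64_t a1, b, a2;

    if (read_tsc_on_cpu(cpu_a, &a1) != 0 ||
        read_tsc_on_cpu(cpu_b, &b) != 0 ||
        read_tsc_on_cpu(cpu_a, &a2) != 0)
        return -1;
    *offset = (int64_t)(b - (a1 + (a2 - a1) / 2));
    return 0;
}
```

Calling tsc_offset(0, 1, &off) once a minute and logging the result
would show whether the offset is constant (a fixed skew) or growing
(drift).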

> One way to work around it would be to boot with "notsc". This will
> make your gettimeofday() slower and more inaccurate though.

I will try that and report back.

> Assuming it is not:
>
> Something is wrong with your PIT timer in the southbridge. Maybe
> just run ntpd ?

I do run ntpd. My concern is primarily with sub-second timers
behaving as if they had a granularity of several seconds.
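
To put a number on that, a rough sketch like the following can measure
how late a select() timeout fires. The function name
select_lateness_us is hypothetical. One caveat: it uses gettimeofday()
as the reference clock, and if gettimeofday() is itself derived from
the TSC on the affected machine, the reading may be distorted by the
same fault, so treat the result as indicative only.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: sleep for timeout_us microseconds using select()
 * with no file descriptors, and return how many microseconds later than
 * requested the call returned, as measured by gettimeofday().
 * A signal can make select() return early, giving a negative result. */
long select_lateness_us(long timeout_us)
{
    struct timeval before, after, tv;
    long elapsed;

    assert(timeout_us >= 0);
    tv.tv_sec = timeout_us / 1000000;
    tv.tv_usec = timeout_us % 1000000;

    gettimeofday(&before, NULL);
    select(0, NULL, NULL, NULL, &tv);   /* no fds: acts as a pure timer */
    gettimeofday(&after, NULL);

    elapsed = (after.tv_sec - before.tv_sec) * 1000000L
            + (after.tv_usec - before.tv_usec);
    return elapsed - timeout_us;
}
```

On a healthy box, select_lateness_us(100000) should come back within
a few timer ticks of zero; results in the range of whole seconds would
match the behavior described above.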

> I know that later AMD chipsets - in particular the 8111 - are somewhat
> bad time keepers, which makes it a good idea to run NTP always.

The bug is all or nothing. Without the bootup skew report, the
machine runs flawlessly indefinitely.

Monty