Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)

From: Bernd Paysan
Date: Tue May 10 2005 - 06:39:47 EST


On Tuesday 10 May 2005 13:12, Andi Kleen wrote:
> > So that explains why nobody sees this problem. But the TSC-based
> > fallback timekeeping is still broken on SMP systems with PowerNow and
> > distributed IRQ handling, which both together seem to be rare enough
> > ;-).
>
> There is a patch pending for the TSC problem - using the pmtimer instead
> in this case.
>
> But the distributed timer interrupt problem is weird. It should not
> happen. You sure it was IRQ 0 that was duplicated and not "LOC" ?

Yes. Only one CPU actually gets and handles the timer interrupt, but which
one is somewhat random (for about 10 seconds, it's the same CPU, then it
switches over).
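
For anyone who wants to reproduce this observation, here is a minimal sketch of how two snapshots of the timer line can be compared per CPU. The function name timer_delta is made up for illustration; the two sample lines are the timer lines quoted further down in this mail, so they are several minutes apart and both counters have moved. With snapshots taken one second apart I would expect only one CPU's counter to advance, but that is an assumption the sketch does not demonstrate.

```sh
#!/bin/sh
# Sketch only: per-CPU increase of the IRQ 0 counters between two snapshots
# of the "timer" line from /proc/interrupts.
# timer_delta is a hypothetical helper name, not an existing tool.

# timer_delta "BEFORE_LINE" "AFTER_LINE"
timer_delta() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # First line: remember every numeric counter that follows "0:".
        NR == 1 { for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) before[i] = $i
                  n = i - 1 }
        # Second line: print how far each counter has advanced.
        NR == 2 { for (i = 2; i <= n; i++)
                      printf "CPU%d +%d\n", i - 2, $i - before[i] }'
}

# The two timer lines quoted in this mail, used as sample input.
timer_delta "0: 40347440 40582285 IO-APIC-edge timer" \
            "0: 40523846 40753939 IO-APIC-edge timer"
```

On a live system the two arguments could be captured with something like `grep -w timer /proc/interrupts | head -n 1`, with a `sleep 1` in between.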

> When you watch -n1 cat /proc/interrupts does the rate roughly match
> up to 1000Hz?

Yes, and this is confirmed over a longer period:

# grep timer /proc/interrupts; uptime
0: 40347440 40582285 IO-APIC-edge timer
1:26pm an 22:28, 1 user, Durchschnittslast: 0,00, 0,01, 0,04
# echo $(( (3600*22+28*60)*1000 )) $(( 40347440+40582285 ))
80880000 80929725

Given that uptime is only accurate to the minute, this looks very
reasonable: the surplus of 49725 ticks corresponds to about 50 seconds at
1000 Hz. The distribution is also close to 50:50. That's (almost) true
for all interrupt sources:

# cat /proc/interrupts
           CPU0       CPU1
  0:   40523846   40753939    IO-APIC-edge  timer
  1:          3        189    IO-APIC-edge  i8042
  8:        261        280    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 15:     364369     364479    IO-APIC-edge  ide1
169:      59195      55498   IO-APIC-level  3w-9xxx
177:     618198     604643   IO-APIC-level  3w-9xxx
185:    8195891    8147619   IO-APIC-level  aic79xx, eth1
193:          0         30   IO-APIC-level  aic79xx
201:          0          0   IO-APIC-level  ohci_hcd, ohci_hcd
NMI:       1184       1013
LOC:   81273966   81271958
ERR:          0
MIS:          0
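
The uptime comparison above can be written as a small script. This is only a sketch: HZ=1000 is the rate mentioned in this thread, the function names expected_ticks and sum_timer_ticks are made up for illustration, and the sample input is the timer line from the grep output earlier in this mail.

```sh
#!/bin/sh
# Sketch only: compare the summed IRQ 0 counters with the number of ticks
# that the uptime predicts. HZ=1000 is the rate discussed in this thread;
# both function names are hypothetical.

HZ=1000

# expected_ticks HOURS MINUTES -> ticks expected after that much uptime
expected_ticks() {
    echo $(( ($1 * 3600 + $2 * 60) * HZ ))
}

# sum_timer_ticks: read /proc/interrupts-style text on stdin and add up
# the per-CPU counters on the line whose last field is "timer".
sum_timer_ticks() {
    awk '$NF == "timer" {
             s = 0
             for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) s += $i
             print s
         }'
}

expected=$(expected_ticks 22 28)
actual=$(printf '0: 40347440 40582285 IO-APIC-edge timer\n' | sum_timer_ticks)
echo "expected=$expected actual=$actual surplus=$(( actual - expected ))"
# prints: expected=80880000 actual=80929725 surplus=49725
```

A surplus below 60 * HZ is within what the one-minute resolution of uptime can explain; a timer that really ran twice as fast would show a surplus of roughly the expected count itself.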

--
Bernd Paysan
http://www.mikron.de/
