Re: Regression in 2.6.27 caused by commit bfc0f59

From: Bill Davidsen
Date: Tue Sep 02 2008 - 13:03:22 EST


Linus Torvalds wrote:

> On Tue, 2 Sep 2008, Thomas Gleixner wrote:
> > We had the same problem versus the local APIC timer calibration, which
> > had basically the same algorithm as the TSC one and we changed it to
> > look at the PMTimer as well in the days where we debugged the initial
> > wreckage caused by the nohz/highres changes.
>
> Hmm.
>
> So then how would you discover when it's reliable and when it's not? Just hardcode it for certain machines?

Looking at values for old K6 machines, I would suspect that doing the test three times and checking the deviation would be enough. If the timer is emulated the value will jump around; if it is stable it can be used. Since this code runs only once, you could increase the number of trials to five or so, keeping the high and low. If changing values are part of the problem, make them part of the solution.
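
A minimal sketch of the repeated-trial idea, in plain user-space C rather than kernel code. The names calibrate_checked, fake_stable and fake_jumpy, and the 500 ppm tolerance used in the note below, are made up for illustration; a real version would call the actual calibration routine instead of the stand-in samplers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns one calibration result (e.g. a kHz estimate). */
typedef uint64_t (*calib_fn)(void);

/*
 * Run the calibration `trials` times, keeping the lowest and highest result.
 * Returns 1 and stores the midpoint in *result if the spread (high - low) is
 * within max_ppm parts per million of the low value; returns 0 if the values
 * jump around too much to be trusted.
 */
static int calibrate_checked(calib_fn sample, int trials,
                             uint64_t max_ppm, uint64_t *result)
{
    uint64_t lo = UINT64_MAX, hi = 0;

    assert(sample != NULL && trials > 0 && result != NULL);

    for (int i = 0; i < trials; i++) {
        uint64_t v = sample();
        if (v < lo)
            lo = v;
        if (v > hi)
            hi = v;
    }

    /* Cross-multiply instead of dividing; assumes values small enough
     * that the 64-bit products do not overflow. */
    if ((hi - lo) * 1000000ULL > lo * max_ppm)
        return 0;

    *result = lo + (hi - lo) / 2;
    return 1;
}

/* Stand-in samplers for illustration only; real code would read hardware. */
static uint64_t fake_stable(void)
{
    static const uint64_t v[] = { 400000, 400010, 399995, 400005, 400000 };
    static size_t i;
    return v[i++ % 5];
}

static uint64_t fake_jumpy(void)
{
    static const uint64_t v[] = { 400000, 512345, 287654, 400100, 633000 };
    static size_t i;
    return v[i++ % 5];
}
```

With five trials and a 500 ppm tolerance, the stable sampler is accepted and the jumpy one is rejected, so the caller knows to fall back to another time source.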

> One alternative might be to do the same "detect if it's SMM code by seeing how long the read takes" for the PIT reads themselves. Right now the code does it for the HPET timer read and for the PM_TIMER reads, but _not_ for the PIT status register reads.
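
For readers following along, the "see how long the read takes" check can be sketched roughly as below. This is an illustration in user-space C, not the kernel's code: timed_read, read_retry, the callback types and the fake_* helpers are all hypothetical names, and the cycle counter is simulated so the example is deterministic.

```c
#include <stdint.h>

typedef uint64_t (*cycles_fn)(void);  /* stands in for a TSC read */
typedef uint32_t (*timer_fn)(void);   /* stands in for a PIT/PM timer/HPET read */

/*
 * Bracket one timer read with two cycle-counter reads. If the bracket took
 * more than max_cycles, assume something (such as an SMI) got in the way and
 * reject the sample. Returns 1 on a clean read, 0 otherwise.
 */
static int timed_read(timer_fn read_timer, cycles_fn cycles,
                      uint64_t max_cycles, uint32_t *value)
{
    uint64_t t1 = cycles();
    uint32_t v = read_timer();
    uint64_t t2 = cycles();

    if (t2 - t1 > max_cycles)
        return 0;
    *value = v;
    return 1;
}

/* Retry up to max_tries times; returns 1 as soon as one read is clean. */
static int read_retry(timer_fn read_timer, cycles_fn cycles,
                      uint64_t max_cycles, int max_tries, uint32_t *value)
{
    for (int i = 0; i < max_tries; i++)
        if (timed_read(read_timer, cycles, max_cycles, value))
            return 1;
    return 0;
}

/* Simulated hardware: each timer read "costs" a scripted number of cycles. */
static uint64_t fake_now;
static const uint64_t fake_cost[] = { 90000, 300, 250 };
static unsigned fake_idx;

static uint64_t fake_cycles(void)
{
    return fake_now;
}

static uint32_t fake_timer(void)
{
    fake_now += fake_cost[fake_idx % 3];
    return 1000 + fake_idx++;
}
```

A bounded retry like this only helps when the disturbance is occasional; if every attempt is slow, read_retry gives up and the caller still has to pick another timer.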

> > How do you prevent the SMM brain damage, when it hits 3 times in a row?
>
> Well, the biggest problem is actually _detection_.
>
> We have three different timers, and they all have their own problems. How do you reliably detect which one to use? The PM_TIMER clearly is _not_ always the answer here, but the code just assumes it is!
>
> Linus


--
Bill Davidsen <davidsen@xxxxxxx>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot
