Re: [PATCH] x86/tsc: Add option to force HW timer based recalibration

From: Feng Tang
Date: Mon May 09 2022 - 07:23:09 EST


Hi Thomas,

Thanks for the comments!

On Mon, May 09, 2022 at 12:01:42PM +0200, Thomas Gleixner wrote:
> Feng,
>
> On Mon, May 09 2022 at 15:30, Feng Tang wrote:
> > On Mon, May 09, 2022 at 09:16:52AM +0200, Peter Zijlstra wrote:
> >> On Mon, May 09, 2022 at 12:58:39PM +0800, Feng Tang wrote:
> >> > And there are still a few corner cases where the freq info is not
> >> > accurate enough and deviates slightly from the actual value, like on
> >> > a product with an early buggy version of firmware or on some
> >> > pre-production hardware.
> >> >
> >> > Add an option 'recalibrate' for 'tsc' kernel parameter to force the
> >> > tsc freq recalibration with HPET/PM_TIMER, and warn if the deviation
> >> > from previous value is more than about 500 PPM.
> >> >
> >> > Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx>
> >>
> >> Why isn't 'tsc_early_khz=' working for you? Afaict that will
> >> override calibrate_tsc() when provided and as such can be used on these
> >> early platforms to provide the right value until such time that the
> >> firmware is fixed.
> >
> > For the early platforms, the problem we met is that we don't know what
> > the 'correct' tsc-freq is, and the value from MSR/CPUID could be wrong.
> >
> > And there was a generation where, after enabling some feature, each
> > instance of the HW has a slightly different frequency, so there is
> > no central "one for all" value to set for 'tsc_early_khz'.
> >
> > This option is more like a way to double-check the correctness of
> > the tsc-freq obtained from MSR/CPUID(0x15).
>
> If at all it's a workaround for the inability and ignorance of firmware
> people. The crystal frequency and the TSC/crystal ratio _are_ known to
> the system designer and firmware people. It's really not asked too much
> to put the correct values into CPUID(0x15) and have proper quality
> control to ensure the correctness.
>
> The whole argument about early firmware and pre-production hardware is
> bogus. It's 2022 and we have been debating this problem for more than a
> decade now and still hardware and firmware people think they can do what
> they want and it all can be "fixed" in software. It's not rocket science
> to get this straight.

I completely understand it, as we've also suffered a lot from such
problems. This patch doesn't change any current workflow; it simply
recalibrates and prints out the freq info (and warns if there is a big
deviation). It actually gives SW guys a quick way to argue with HW/FW
people: "See! You've given us a wrong number, please fix it". Otherwise
we end up like the customer I heard of long ago, who used an atomic clock
to prove the deviation.
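
To make the "big deviation" part concrete, the check boils down to
something like the sketch below. This is a standalone illustration, not
the code from the patch: the function and macro names are made up here,
and only the ~500 PPM threshold comes from the patch description.

```c
#include <stdint.h>

/* Threshold from the patch description ("about 500 PPM"). */
#define TSC_DEVIATION_WARN_PPM	500

/*
 * Hypothetical helper: absolute deviation of the recalibrated frequency
 * from the reference one (e.g. the value derived from MSR/CPUID), in
 * parts per million. Both inputs are in kHz.
 */
static uint64_t tsc_deviation_ppm(uint64_t ref_khz, uint64_t recal_khz)
{
	uint64_t diff;

	if (!ref_khz)
		return UINT64_MAX;	/* no usable reference value */

	diff = ref_khz > recal_khz ? ref_khz - recal_khz
				   : recal_khz - ref_khz;

	return diff * 1000000ULL / ref_khz;
}

/* Hypothetical helper: non-zero if the deviation deserves a warning. */
static int tsc_deviation_too_big(uint64_t ref_khz, uint64_t recal_khz)
{
	return tsc_deviation_ppm(ref_khz, recal_khz) > TSC_DEVIATION_WARN_PPM;
}
```

For a 2100000 kHz reference, for example, a recalibrated value has to be
more than 1050 kHz off before the warning triggers.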

> Aside of that HPET has become unreliable and PM timer is not guaranteed
> to be there either. So we really do not need a mechanism to enforce
> recalibration against something which is not guaranteed to provide
> sensible information.

Correct. The HPET on new client platforms ends up being disabled because
of the PC10 issue, though it's fine on server platforms, where tsc accuracy
is more important. Also, even for the disabled HPET case, I remember that
you once suggested leveraging its capability for calibration, and
only disabling it before the cpu idle framework really starts :)

Thanks,
Feng

> Thanks,
>
> tglx