[PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

From: Feng Tang
Date: Sun May 21 2023 - 23:37:49 EST


Commit a7ec817d5542 ("x86/tsc: Add option to force frequency
recalibration with HW timer") was added to handle cases that the
firmware has bug and provides a wrong TSC frequency number, and it
is optional given that this kind of firmware issue rarely happens
(Paul reported once [1]).

But Rui reported that some Sapphire Rapids platform met this issue
again recently, and as firmware is also a kind of 'software' which
can't be bug free, make the recalibration default on. When the
values from firmware and HW timer's calibration have big gap,
raise a warning and let vendor to check which side is broken.

One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
and they will also do this recalibration.

[1]. https://lore.kernel.org/lkml/20221117230910.GI4001@paulmck-ThinkPad-P17-Gen-1/
Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx>
---
Documentation/admin-guide/kernel-parameters.txt | 4 ----
arch/x86/kernel/tsc.c | 14 ++++----------
2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 9e5bab29685f..a826ab3b5dfb 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6451,10 +6451,6 @@
in situations with strict latency requirements (where
interruptions from clocksource watchdog are not
acceptable).
- [x86] recalibrate: force recalibration against a HW timer
- (HPET or PM timer) on systems whose TSC frequency was
- obtained from HW or FW using either an MSR or CPUID(0x15).
- Warn if the difference is more than 500 ppm.
[x86] watchdog: Use TSC as the watchdog clocksource with
which to check other HW timers (HPET or PM timer), but
only on systems where TSC has been deemed trustworthy.
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 344698852146..b77c5b1ad304 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -48,8 +48,6 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);

int tsc_clocksource_reliable;

-static int __read_mostly tsc_force_recalibrate;
-
static u32 art_to_tsc_numerator;
static u32 art_to_tsc_denominator;
static u64 art_to_tsc_offset;
@@ -310,8 +308,6 @@ static int __init tsc_setup(char *str)
__func__);
tsc_as_watchdog = 0;
}
- if (!strcmp(str, "recalibrate"))
- tsc_force_recalibrate = 1;
if (!strcmp(str, "watchdog")) {
if (no_tsc_watchdog)
pr_alert("%s: tsc=watchdog overridden by earlier tsc=nowatchdog\n",
@@ -1395,7 +1391,6 @@ static void tsc_refine_calibration_work(struct work_struct *work)
else
freq = calc_pmtimer_ref(delta, ref_start, ref_stop);

- /* Will hit this only if tsc_force_recalibrate has been set */
if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {

/* Warn if the deviation exceeds 500 ppm */
@@ -1456,17 +1451,16 @@ static int __init init_tsc_clocksource(void)
clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;

/*
- * When TSC frequency is known (retrieved via MSR or CPUID), we skip
- * the refined calibration and directly register it as a clocksource.
+ * When TSC frequency is known (retrieved via MSR or CPUID), we
+ * directly register it as a clocksource. As the firmware could
+ * be wrong (very unlikely) about the values, the recalibration
+ * by hardware timer is kept just as a sanity check.
*/
if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
if (boot_cpu_has(X86_FEATURE_ART))
art_related_clocksource = &clocksource_tsc;
clocksource_register_khz(&clocksource_tsc, tsc_khz);
clocksource_unregister(&clocksource_tsc_early);
-
- if (!tsc_force_recalibrate)
- return 0;
}

schedule_delayed_work(&tsc_irqwork, 0);
--
2.34.1