Re: [REGRESSION] ? system is stuck in clocksource, >60s delay at boot time without tsc=unstable

From: Fab Stz
Date: Tue Feb 25 2025 - 04:07:39 EST


Hello Thomas,

Thank you for the patch! I built a 6.1 kernel with it applied and it appears to work as expected (no boot delay). Logs are below and the full dmesg log is attached. The interesting line is probably:

Feb 25 08:53:51 debian kernel: tsc: Marking TSC unstable due to TSC halts in idle

For comparison, an unpatched kernel booted with tsc=unstable logs "Marking TSC unstable due to boot parameter" instead.

+ cat /sys/devices/system/cpu/cpuidle/available_governors
ladder menu
+ cat /sys/devices/system/cpu/cpuidle/current_driver
intel_idle
+ cat /sys/devices/system/cpu/cpuidle/current_governor
menu
+ cat /sys/devices/system/cpu/cpuidle/current_governor_ro
menu
+ ls /sys/devices/system/cpu/cpu0/cpuidle/
state0 state1 state2 state3
+ cat /sys/devices/system/cpu/cpu0/cpuidle/state0/name
POLL
+ cat /sys/devices/system/cpu/cpu0/cpuidle/state0/disable
0
+ cat /sys/devices/system/cpu/cpu0/cpuidle/state1/name
C1_ACPI
+ cat /sys/devices/system/cpu/cpu0/cpuidle/state1/disable
0
+ cat /sys/devices/system/cpu/cpu0/cpuidle/state2/name
C2_ACPI
+ cat /sys/devices/system/cpu/cpu0/cpuidle/state2/disable
0
+ cat /sys/devices/system/cpu/cpu0/cpuidle/state3/name
C3_ACPI
+ cat /sys/devices/system/cpu/cpu0/cpuidle/state3/disable
0
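
In case it helps others collect the same data, here is a minimal sketch that loops over all cpuidle states of one CPU instead of reading them one by one. The function name list_cpuidle_states is made up for illustration, and it assumes the usual sysfs layout shown in the commands above.

```shell
#!/bin/sh
# Print every cpuidle state of one CPU as "<state> <name> disable=<0|1>".
# $1: cpuidle directory of a CPU (defaults to cpu0 on the running system).
# The function name is hypothetical; only the sysfs paths come from the mail.
list_cpuidle_states() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for state in "$dir"/state*; do
        # Skip the unexpanded glob when the directory has no states.
        [ -r "$state/name" ] || continue
        printf '%s %s disable=%s\n' \
            "$(basename "$state")" \
            "$(cat "$state/name")" \
            "$(cat "$state/disable")"
    done
}
```

Calling list_cpuidle_states without an argument reads cpu0 of the running system; passing another directory (for example the cpuidle directory of cpu1) inspects that CPU instead.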


Will the patch also be backported to longterm releases such as 6.1?

Regards
Fab


On 24/02/2025 at 09:13, Thomas Gleixner wrote:
BTW, I tried the "processor.max_cstate=1" you mentioned but it didn't
change anything on the delay and/or warning.

That's weird, but we have no idea what kind of magic the BIOS implements
there for power management behind the kernel's back. I assume that it
does because this generation of CPUs uses the ACPI processor idle driver
and that disables TSC when it detects that the system supports
C-states > 1.

Output of these commands can be found in attached file cpuidle.txt

+ cat /sys/devices/system/cpu/cpuidle/current_driver
intel_idle

So according to that the intel_idle driver is in use, which does not
have the magic TSC workarounds like the acpi processor driver has, but
it seems to be preferred at load time.

Sigh. Why is the intel_idle driver so aggressive in taking over despite
the fact that it does not handle the old CPUs, which are known to
require the TSC workaround? It handles the APIC stops in C2, but not the
TSC oddity while the original ACPI processor_idle driver does the right
thing for more than two decades....

Can the kernel be patched so that the proper config is used
automatically (i.e. without the user having to set any parameter)? I'm
not sure my question actually makes sense.

Yes, we can. Untested patch below. It just brings the intel idle driver
on par with the original ACPI processor idle driver to deal with that
problem.

Thanks,

tglx
---
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 118fe1d37c22..0fdb1d1316c4 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -56,6 +56,7 @@
 #include <asm/intel-family.h>
 #include <asm/mwait.h>
 #include <asm/spec-ctrl.h>
+#include <asm/tsc.h>
 #include <asm/fpu/api.h>

 #define INTEL_IDLE_VERSION "0.5.1"
@@ -1799,6 +1800,9 @@ static void __init intel_idle_init_cstates_acpi(struct cpuidle_driver *drv)
 		if (intel_idle_state_needs_timer_stop(state))
 			state->flags |= CPUIDLE_FLAG_TIMER_STOP;

+		if (cx->type > ACPI_STATE_C1 && !boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
+			mark_tsc_unstable("TSC halts in idle");
+
 		state->enter = intel_idle;
 		state->enter_s2idle = intel_idle_s2idle;
 	}
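
For anyone who wants to guess from userspace whether a machine would hit the condition added by the patch, a rough sketch follows. It is only an approximation: the kernel tests cx->type, while this script infers depth from the C<n>_ACPI state names, which is a heuristic and may not match the ACPI type in every case. The function names are made up for illustration; nonstop_tsc is the flag name as it appears in /proc/cpuinfo.

```shell
#!/bin/sh
# Succeeds (exit 0) if the "flags" line of a cpuinfo file lists nonstop_tsc.
# $1: cpuinfo file (defaults to /proc/cpuinfo).
has_nonstop_tsc() {
    grep '^flags' "${1:-/proc/cpuinfo}" | grep -qw nonstop_tsc
}

# Succeeds if any cpuidle state name looks deeper than C1.
# Heuristic (an assumption, not what the kernel checks): C<n>_ACPI with n >= 2.
# $1: cpuidle directory of a CPU (defaults to cpu0 on the running system).
has_deep_acpi_cstate() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    cat "$dir"/state*/name 2>/dev/null |
        grep -Eq '^C([2-9]|[1-9][0-9]+)_ACPI$'
}

# Userspace approximation of the patch's condition:
# a state deeper than C1 exists and the CPU lacks a non-stop TSC.
# $1: cpuidle directory, $2: cpuinfo file (both optional).
tsc_halts_in_idle() {
    has_deep_acpi_cstate "$1" && ! has_nonstop_tsc "$2"
}
```

Running "tsc_halts_in_idle && echo affected" on the running system uses the default paths; both arguments exist mainly so the check can be pointed at saved copies of the files.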

Feb 25 08:53:50 debian kernel: Command line: root=UUID=462f57b4-136e-4c18-8c55-e4bc59cfb7aa ro zswap.enabled=1 mem_sleep_default=deep single initrd=\boot\initrd.img-6.1.0-0.a.test-amd64
Feb 25 08:53:50 debian kernel: tsc: Fast TSC calibration using PIT
Feb 25 08:53:50 debian kernel: tsc: Detected 2653.363 MHz processor
Feb 25 08:53:50 debian kernel: clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
Feb 25 08:53:50 debian kernel: Kernel command line: root=UUID=462f57b4-136e-4c18-8c55-e4bc59cfb7aa ro zswap.enabled=1 mem_sleep_default=deep single initrd=\boot\initrd.img-6.1.0-0.a.test-amd64
Feb 25 08:53:50 debian kernel: Unknown kernel command line parameters "single", will be passed to user space.
Feb 25 08:53:50 debian kernel: clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 76450417870 ns
Feb 25 08:53:50 debian kernel: clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x263f25ec687, max_idle_ns: 440795217651 ns
Feb 25 08:53:50 debian kernel: clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
Feb 25 08:53:51 debian kernel: clocksource: Switched to clocksource tsc-early
Feb 25 08:53:51 debian kernel: clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
Feb 25 08:53:51 debian kernel: tsc: Refined TSC clocksource calibration: 2653.335 MHz
Feb 25 08:53:51 debian kernel: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x263f0bda6a8, max_idle_ns: 440795254345 ns
Feb 25 08:53:51 debian kernel: clocksource: Switched to clocksource tsc
Feb 25 08:53:51 debian kernel: tsc: Marking TSC unstable due to TSC halts in idle
Feb 25 08:53:51 debian kernel: clocksource: Checking clocksource tsc synchronization from CPU 1 to CPUs 0.
Feb 25 08:53:51 debian kernel: clocksource: Switched to clocksource hpet
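
To compare patched and unpatched boots without reading the whole log, the two relevant facts can be pulled out with a small filter. This is only a sketch: the function names are invented, and it assumes the message wording seen in the log above.

```shell
#!/bin/sh
# Print the reason from the first "Marking TSC unstable due to ..." line
# read from stdin (prints nothing if the TSC was never marked unstable).
tsc_unstable_reason() {
    sed -n 's/.*tsc: Marking TSC unstable due to \(.*\)$/\1/p' | head -n 1
}

# Print the name of the last clocksource the kernel switched to,
# based on the "Switched to clocksource <name>" lines read from stdin.
final_clocksource() {
    sed -n 's/.*clocksource: Switched to clocksource \(.*\)$/\1/p' | tail -n 1
}
```

Both functions read a kernel log on stdin, so they work on a saved dmesg file as well as on live output, for example "journalctl -k -b | tsc_unstable_reason".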