Re: [patch 20/48] x86/apic: Enable TSC coupled programming mode

From: Thomas Gleixner

Date: Tue Mar 03 2026 - 15:25:48 EST


On Tue, Mar 03 2026 at 10:38, Nathan Chancellor wrote:
> On Tue, Mar 03, 2026 at 03:37:03PM +0100, Thomas Gleixner wrote:
>> On Mon, Mar 02 2026 at 18:29, Nathan Chancellor wrote:
>> >
>> > After this change landed in -next as commit f246ec3478cf ("x86/apic:
>> > Enable TSC coupled programming mode"), two of my Intel-based test
>> > machines fail to boot. Unfortunately, I do not think I have any serial
>> > access on these, so I have little introspective ability. Is there any
>> > information I can provide or patches I can test to try and help figure
>> > out what is going on here? I have attached the output of lscpu of both
>> > machines, in case there is some common thread there.
>>
>> Grmbl. I stared at it for a while and I have a suspicion. Can you try
>> the patch below and also provide from one of the machines the output of
>>
>> dmesg | grep -i tsc
>
> This patch works on both machines, so your suspicion seemed spot on.
>
> Output of that dmesg command appears to be the same between
> 89f951a1e8ad and f246ec3478cf with that diff applied:
>
> [ 0.000000] tsc: Detected 2500.000 MHz processor
> [ 0.000000] tsc: Detected 2496.000 MHz TSC
> [ 0.008989] TSC deadline timer available
> [ 0.119139] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x23fa772cf26, max_idle_ns: 440795269835 ns
> [ 0.312141] clocksource: Switched to clocksource tsc-early
> [ 0.322686] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23fa772cf26, max_idle_ns: 440795269835 ns
> [ 0.322951] clocksource: Switched to clocksource tsc

Ha! That's exactly what I suspected. What happens is:

TSC-early is installed, which is valid neither for high resolution
timers nor for coupled mode. A bit later TSC is installed with the same
frequency as TSC-early. That means the shift/mult pair does not change,
so the update of maxns is never invoked. It simply stays 0, so the timer
is always armed for an event in the past and the machine dies from a TSC
deadline timer interrupt storm.
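
To illustrate the failure mode, here is a rough user-space sketch. It is
not the kernel code: the struct, the function names, the way maxns is
derived and the "recompute whenever valid" variant are all made up for
illustration, and the actual fix may look different. It only models the
sequence described above: an invalid clocksource is installed first, then
a valid one with an identical shift/mult pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the ns -> TSC cycles conversion state. */
struct conv_state {
	uint32_t mult;
	uint32_t shift;
	bool	 valid;		/* usable for coupled mode? */
	uint64_t max_ns;	/* largest delta that is safe to convert, 0 until set */
};

/* Assumed definition: largest ns value whose product with mult fits in 64 bits. */
static uint64_t compute_max_ns(uint32_t mult)
{
	assert(mult != 0);
	return UINT64_MAX / mult;
}

/* Models the problem: max_ns is refreshed only when shift/mult changes. */
static void install_buggy(struct conv_state *cs, uint32_t mult,
			  uint32_t shift, bool valid)
{
	bool changed = cs->mult != mult || cs->shift != shift;

	cs->mult = mult;
	cs->shift = shift;
	cs->valid = valid;
	if (valid && changed)
		cs->max_ns = compute_max_ns(mult);
}

/* One possible remedy (an assumption): refresh whenever the source is valid. */
static void install_fixed(struct conv_state *cs, uint32_t mult,
			  uint32_t shift, bool valid)
{
	cs->mult = mult;
	cs->shift = shift;
	cs->valid = valid;
	if (valid)
		cs->max_ns = compute_max_ns(mult);
}

/* Clamp the delta to max_ns, convert to cycles and add to the current TSC. */
static uint64_t deadline_cycles(const struct conv_state *cs,
				uint64_t now_cycles, uint64_t delta_ns)
{
	if (delta_ns > cs->max_ns)
		delta_ns = cs->max_ns;
	return now_cycles + ((delta_ns * cs->mult) >> cs->shift);
}
```

With the first variant, installing the pair once as invalid and once as
valid leaves max_ns at 0, so every delta is clamped to 0 and the computed
deadline equals the current TSC value. By the time it is written to the
hardware it is already in the past, and the interrupt fires immediately.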

On all my test machines the TSC frequency is refined against HPET and
installed late, and that refinement always changes the shift/mult pair,
so I never ran into this situation and obviously did not think about it
either.

Let me write a proper change log and get this into the tip tree.

Thanks for testing!

tglx