Re: [PATCH] x86/mce: Fix timer interval adjustment after logging a MCE event

From: Borislav Petkov

Date: Mon Feb 02 2026 - 10:25:14 EST


On Wed, Jan 14, 2026 at 03:48:13PM +0100, Borislav Petkov wrote:
> Now on to find what causes this. Even if we can't find the proper commit,
> I guess testing 6.18 and 6.12 - the LTS kernels - should be good enough as to
> backport a fix there.

Ok, finally back to staring at this.

Looks like adding this:

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 34440021e8cf..b94efe5950c4 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -154,6 +154,8 @@ void mce_log(struct mce_hw_err *err)
 {
 	if (mce_gen_pool_add(err))
 		irq_work_queue(&mce_irq_work);
+
+	set_bit(0, &mce_need_notify);
 }
 EXPORT_SYMBOL_GPL(mce_log);

makes the interval halve again, see below for the timestamps.
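
For context, the idea is that the polling interval shrinks while errors keep
getting logged and grows back when things are quiet. Below is a rough
userspace model of that behavior, not the kernel code itself:
adjust_interval() and the two limits (300s ceiling, 10ms floor) are names and
values assumed here purely for illustration.

```c
#include <stdbool.h>

/* Assumed limits for this sketch only: 300s ceiling, 10ms floor. */
#define MAX_INTERVAL_MS 300000UL
#define MIN_INTERVAL_MS 10UL

/*
 * Hypothetical helper: halve the polling interval when an event was
 * notified, double it otherwise, and clamp the result to the limits above.
 */
static unsigned long adjust_interval(unsigned long iv_ms, bool notified)
{
	if (notified) {
		iv_ms /= 2;
		return iv_ms < MIN_INTERVAL_MS ? MIN_INTERVAL_MS : iv_ms;
	}

	iv_ms *= 2;
	return iv_ms > MAX_INTERVAL_MS ? MAX_INTERVAL_MS : iv_ms;
}
```

Starting from 300000 ms, four consecutive calls with notified == true give
150000, 75000, 37500 and 18750 ms. If the notify flag never gets set, only the
doubling branch runs and the interval stays pinned at the ceiling.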

I guess I'll do a proper patch from the hunk here:

https://lore.kernel.org/r/20260113224152.GVaWbKMMzManQ5WwlT@fat_crate.local

along with 6.12 and 6.18 backports and see whether that's good enough as
a stable fix too.

Thx.

[ 316.795248] mce: [Hardware Error]: Machine check events logged
[ 316.795262] mce: [Hardware Error]: Machine check events logged
[ 316.798331] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 316.800104] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 316.801442] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770040950 SOCKET 0 APIC 0 microcode 800820d
[ 628.091492] mce: [Hardware Error]: Machine check events logged
[ 628.091515] mce: [Hardware Error]: Machine check events logged
[ 628.097216] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 628.101393] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 628.103992] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770041262 SOCKET 0 APIC 0 microcode 800820d

<--- it starts decreasing the interval here.

[ 825.581354] hrtimer: interrupt took 18820 ns
[ 939.387367] mce: [Hardware Error]: Machine check events logged
[ 939.390185] mce: [Hardware Error]: Machine check events logged
[ 939.392936] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 939.396465] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 939.399042] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770041573 SOCKET 0 APIC 0 microcode 800820d
[ 1103.227402] mce: [Hardware Error]: Machine check events logged
[ 1103.230267] mce: [Hardware Error]: Machine check events logged
[ 1103.233018] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 1103.236565] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 1103.239146] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770041737 SOCKET 0 APIC 0 microcode 800820d
[ 1179.003479] mce: [Hardware Error]: Machine check events logged
[ 1179.006452] mce: [Hardware Error]: Machine check events logged
[ 1179.009144] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 1179.012757] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 1179.015338] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770041813 SOCKET 0 APIC 0 microcode 800820d
[ 1217.915386] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 1217.919088] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 1217.921662] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770041852 SOCKET 0 APIC 0 microcode 800820d
[ 1238.395440] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 1238.399041] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 1238.401619] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770041872 SOCKET 0 APIC 0 microcode 800820d
[ 1269.115368] mce_notify_irq: 4 callbacks suppressed
[ 1269.117829] mce: [Hardware Error]: Machine check events logged
[ 1269.120586] mce: [Hardware Error]: Machine check events logged
[ 1269.123412] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 1269.126950] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 1269.129511] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770041903 SOCKET 0 APIC 0 microcode 800820d

and then it started enlarging it again when I changed the injection interval
to 300s.

[ 1578.363408] mce: [Hardware Error]: Machine check events logged
[ 1578.366346] mce: [Hardware Error]: Machine check events logged
[ 1578.369174] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 1578.372742] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 1578.375226] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770042212 SOCKET 0 APIC 0 microcode 800820d
[ 2119.035460] mce: [Hardware Error]: Machine check events logged
[ 2119.038432] mce: [Hardware Error]: Machine check events logged
[ 2119.041236] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 2119.044846] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 2119.047340] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770042753 SOCKET 0 APIC 0 microcode 800820d
[ 2282.875491] mce: [Hardware Error]: Machine check events logged
[ 2282.878409] mce: [Hardware Error]: Machine check events logged
[ 2282.881277] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 2282.884978] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 2282.887482] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770042917 SOCKET 0 APIC 0 microcode 800820d
[ 2512.251516] mce: [Hardware Error]: Machine check events logged
[ 2512.254371] mce: [Hardware Error]: Machine check events logged
[ 2512.257261] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: 9c2041000000011b
[ 2512.260841] mce: [Hardware Error]: TSC 0 ADDR 6d3d483b
[ 2512.263406] mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1770043146 SOCKET 0 APIC 0 microcode 800820d
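
To read the shrinking interval off the log without doing the subtraction by
hand, here is a small sketch which takes the first dmesg timestamp of each of
the first eight bursts above and computes the gaps between them. burst_ts[]
and burst_gaps() are made-up names for illustration.

```c
#include <stddef.h>

/* First dmesg timestamp of each of the first eight bursts, in seconds. */
static const double burst_ts[] = {
	316.795248, 628.091492, 939.387367, 1103.227402,
	1179.003479, 1217.915386, 1238.395440, 1269.115368,
};

/* Hypothetical helper: store ts[i + 1] - ts[i] in gaps[], return the count. */
static size_t burst_gaps(const double *ts, size_t n, double *gaps)
{
	size_t i;

	if (n < 2)
		return 0;

	for (i = 0; i + 1 < n; i++)
		gaps[i] = ts[i + 1] - ts[i];

	return n - 1;
}
```

Fed those eight timestamps, it gives gaps of roughly 311.3, 311.3, 163.8,
75.8, 38.9, 20.5 and 30.7 seconds.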

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette