Re: [PATCH] x86/mce: Restore MCA polling interval halving
From: Borislav Petkov
Date: Thu Apr 23 2026 - 08:55:03 EST
On Tue, Apr 21, 2026 at 03:49:06PM +0000, Zhuo, Qiuxu wrote:
> Yes, I think so.
>
> So, to me, the current rough and simple method for catching frequent error cases is a good
> trade-off between accuracy and complexity.
>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
> Tested-by: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
Thanks!
I have this now - I will hammer on it some more before I queue it.
---
Author: Borislav Petkov (AMD) <bp@xxxxxxxxx>
Date: Mon Mar 16 16:12:00 2026 +0100
x86/mce: Restore MCA polling interval halving
RongQing reported that the MCA polling interval doesn't halve when an
error gets logged. It was traced down to the commit in Fixes:, because:
mce_timer_fn()
|-> mce_poll_banks()
|-> machine_check_poll()
|-> mce_log()
which will queue the work and return.
Now, back in mce_timer_fn():
/*
* Alert userspace if needed. If we logged an MCE, reduce the polling
* interval, otherwise increase the polling interval.
*/
if (mce_notify_irq())
<--- here we haven't ran the notifier chain yet so mce_need_notify is
not set yet so this won't hit and we won't halve the interval iv.
Now the notifier chain runs. mce_early_notifier() sets the bit, does
mce_notify_irq(), that clears the bit and then the notifier chain
a little later logs the error.
So this is a silly timing issue.
But, that's all unnecessary.
All it needs to happen here is, the "should we notify of a logged MCE"
mce_notify_irq() asks, should be simply a question to the mce gen pool:
"Are you empty?"
And that then turns into a simple yes or no answer and it all
JustWorks(tm).
So do that and also distribute the functionality where it belongs:
- Print that MCE events have been logged in mce_log()
- Trigger the mcelog tool specific work in the first notifier
As a result, mce_notify_irq() can go now.
Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector")
Reported-by: Li RongQing <lirongqing@xxxxxxxxx>
Signed-off-by: Borislav Petkov (AMD) <bp@xxxxxxxxx>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
Tested-by: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
Link: https://lore.kernel.org/r/20260112082747.2842-1-lirongqing@xxxxxxxxx
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 8dd424ac5de8..f3a793e3a6c8 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -90,7 +90,6 @@ struct mca_config mca_cfg __read_mostly = {
};
static DEFINE_PER_CPU(struct mce_hw_err, hw_errs_seen);
-static unsigned long mce_need_notify;
/*
* MCA banks polled by the period polling timer for corrected events.
@@ -152,8 +151,10 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
void mce_log(struct mce_hw_err *err)
{
- if (mce_gen_pool_add(err))
+ if (mce_gen_pool_add(err)) {
+ pr_info(HW_ERR "Machine check events logged\n");
irq_work_queue(&mce_irq_work);
+ }
}
EXPORT_SYMBOL_GPL(mce_log);
@@ -585,28 +586,6 @@ bool mce_is_correctable(struct mce *m)
}
EXPORT_SYMBOL_GPL(mce_is_correctable);
-/*
- * Notify the user(s) about new machine check events.
- * Can be called from interrupt context, but not from machine check/NMI
- * context.
- */
-static bool mce_notify_irq(void)
-{
- /* Not more than two messages every minute */
- static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
-
- if (test_and_clear_bit(0, &mce_need_notify)) {
- mce_work_trigger();
-
- if (__ratelimit(&ratelimit))
- pr_info(HW_ERR "Machine check events logged\n");
-
- return true;
- }
-
- return false;
-}
-
static int mce_early_notifier(struct notifier_block *nb, unsigned long val,
void *data)
{
@@ -618,9 +597,7 @@ static int mce_early_notifier(struct notifier_block *nb, unsigned long val,
/* Emit the trace record: */
trace_mce_record(err);
- set_bit(0, &mce_need_notify);
-
- mce_notify_irq();
+ mce_work_trigger();
return NOTIFY_DONE;
}
@@ -1804,7 +1781,7 @@ static void mce_timer_fn(struct timer_list *t)
* Alert userspace if needed. If we logged an MCE, reduce the polling
* interval, otherwise increase the polling interval.
*/
- if (mce_notify_irq())
+ if (!mce_gen_pool_empty())
iv = max(iv / 2, (unsigned long) HZ/100);
else
iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette