[PATCH] x86/mce/amd: Filter bogus L3 deferred errors on CZN A0

From: Yazen Ghannam

Date: Sat Feb 28 2026 - 09:09:12 EST


A user has observed multiple L3 cache deferred error logs after the
recent kernel rework of deferred error handling. [1]

Upon inspection, the errors are determined to be bogus due to
inconsistent status values. The user also verified that the bogus
MCA_DESTAT values are present on the system even with an older
kernel. [2] The errors seem to be garbage values present in the
MCA_DESTAT registers of some L3 cache banks. These were implicitly
ignored before the recent kernel rework because they do not generate a
deferred error interrupt.

A later revision of the rework patch was merged for v6.19. This
naturally filtered out most of the bogus error logs. However, a few
signatures still remain. [3]

Add the remaining bogus signatures to the MCE filter function. Minimize
the scope of the filter to the reported CPU family/model/stepping so
that similar issues are not implicitly masked on other systems.

Fixes: 7cb735d7c0cb ("x86/mce: Unify AMD DFR handler with MCA Polling")
Reported-by: Bert Karwatzki <spasswolf@xxxxxx>
Closes: https://lore.kernel.org/20250915010010.3547-1-spasswolf@xxxxxx
Signed-off-by: Yazen Ghannam <yazen.ghannam@xxxxxxx>
Cc: Mario Limonciello <mario.limonciello@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Link: https://lore.kernel.org/20250915010010.3547-1-spasswolf@xxxxxx # [1]
Link: https://lore.kernel.org/6e1eda7dd55f6fa30405edf7b0f75695cf55b237.camel@xxxxxx # [2]
Link: https://lore.kernel.org/21ba47fa8893b33b94370c2a42e5084cf0d2e975.camel@xxxxxx # [3]
---
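For reviewers who want to poke at the filter condition outside the
kernel, here is a minimal userspace sketch of the predicate described
above. It is not the kernel code: the sk_* names are hypothetical, the
bank type is reduced to a boolean, and the bit positions (PCC = bit 57,
Deferred = bit 44, extended error code in bits 21:16) are my reading of
the kernel's MCI_STATUS_* and XEC() definitions, so treat them as
assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's MCA status bits; names are hypothetical. */
#define SK_STATUS_PCC        (1ULL << 57)  /* processor context corrupt */
#define SK_STATUS_DEFERRED   (1ULL << 44)  /* deferred error */
#define SK_XEC(status, mask) (((status) >> 16) & (mask))

/*
 * Returns true when a logged status should be dropped as bogus.
 * The scope is limited to family 0x19, model 0x50, stepping 0x0
 * (Cezanne A0), L3 cache banks, and deferred errors only.
 */
bool sk_filter_bogus_l3_deferred(unsigned int family, unsigned int model,
                                 unsigned int stepping, bool bank_is_l3,
                                 uint64_t status)
{
	if (family != 0x19 || model != 0x50 || stepping != 0x0)
		return false;

	if (!bank_is_l3 || !(status & SK_STATUS_DEFERRED))
		return false;

	/* Case #1: PCC set on a deferred error. Case #2: XEC 29. */
	return (status & SK_STATUS_PCC) || SK_XEC(status, 0x3f) == 29;
}

/* Small self-check covering both bogus cases and one unfiltered value. */
void sk_self_check(void)
{
	assert(sk_filter_bogus_l3_deferred(0x19, 0x50, 0x0, true,
					   SK_STATUS_DEFERRED | SK_STATUS_PCC));
	assert(sk_filter_bogus_l3_deferred(0x19, 0x50, 0x0, true,
					   SK_STATUS_DEFERRED | (29ULL << 16)));
	assert(!sk_filter_bogus_l3_deferred(0x19, 0x50, 0x0, true,
					    SK_STATUS_DEFERRED | (5ULL << 16)));
}
```

The early returns keep the CPU and bank checks separate from the two
signature checks, so any other stepping, bank type, or non-deferred
status falls through untouched.
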
arch/x86/kernel/cpu/mce/amd.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index da13c1e37f87..7a94492aa50f 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -604,6 +604,18 @@ bool amd_filter_mce(struct mce *m)
 	enum smca_bank_types bank_type = smca_get_bank_type(m->extcpu, m->bank);
 	struct cpuinfo_x86 *c = &boot_cpu_data;

+	/*
+	 * Bogus L3 cache deferred errors on Cezanne A0.
+	 *
+	 * Case #1: PCC bit set. This is not valid for deferred errors.
+	 * Case #2: XEC 29. This is not a valid error code.
+	 */
+	if (c->x86 == 0x19 && c->x86_model == 0x50 && c->x86_stepping == 0x0 &&
+	    bank_type == SMCA_L3_CACHE && (m->status & MCI_STATUS_DEFERRED)) {
+		if ((m->status & MCI_STATUS_PCC) || XEC(m->status, 0x3f) == 29)
+			return true;
+	}
+
 	/* See Family 17h Models 10h-2Fh Erratum #1114. */
 	if (c->x86 == 0x17 &&
 	    c->x86_model >= 0x10 && c->x86_model <= 0x2F &&
--
2.53.0