Re: [PATCH] EDAC/amd64: Fix incorrect Node ID mapping on CPU-only Zen4+ systems

From: Phineas Su

Date: Sun Jun 14 2026 - 10:16:21 EST


On Wed, Jun 10, 2026 at 12:07 AM Yazen Ghannam <yazen.ghannam@xxxxxxx> wrote:
>
> On Mon, Jun 08, 2026 at 07:26:07AM +0000, Phineas Su wrote:
> > On CPU-only systems using AMD Zen4 (Genoa) and newer processors, memory
> > ECC errors can be reported with an incorrect Node ID (incremented by 1).
> > This happens because these CPUs use SMCA_UMC_V2 banks, which triggers
> > the GPU node ID fixup logic in fixup_node_id().
> >
>
> Can you please share the MCA_IPID register value for these banks?
>
> The statement 'Zen4 and newer processors...use SMCA_UMC_V2 banks' isn't
> generally true.
>
> This system may be a special case. Or there may be a hardware issue. Or
> there may be a bug in the kernel enumeration code.
>
> Thanks,
> Yazen

Hi Yazen,

Thanks, you are correct. We misread the code and wrongly concluded that
Zen4 CPUs generally use SMCA_UMC_V2 banks.

I was troubleshooting an internal memory error reporting issue on a
CPU-only system and initially suspected this was the cause.

That said, I still think making the code more robust and clearer is
worthwhile. If a CPU-only system were to report an SMCA_UMC_V2 bank (due
to future hardware iterations, firmware bugs, or enumeration anomalies),
the current logic in fixup_node_id() would incorrectly apply the GPU
fixup.

I will send a v2 of the patch with an updated commit message that frames
the change as a robustness improvement.
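
To illustrate the concern, here is a simplified standalone model (not
the kernel code). The types, the function names, and the fixup
arithmetic are assumptions made for this sketch. It only shows how an
empty GPU node map can shift the node ID by one, and how a guard on the
GPU node count would avoid that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; not the kernel's types. */
enum bank_kind { BANK_UMC, BANK_UMC_V2 };

struct gpu_map {
	uint8_t base_node_id;	/* first hardware node ID used by GPU nodes */
	uint8_t node_count;	/* number of GPU nodes; 0 on a CPU-only system */
};

/* Unguarded model: any UMC_V2 bank at or above the base gets the GPU fixup. */
static int fixup_unguarded(int node_id, enum bank_kind kind, uint8_t hw_nid,
			   const struct gpu_map *map)
{
	assert(map != NULL);

	if (kind != BANK_UMC_V2)
		return node_id;

	/* Nodes below the GPU base are treated as CPU nodes. */
	if (hw_nid < map->base_node_id)
		return node_id;

	/* Assumed conversion from a hardware node ID to a logical one. */
	return hw_nid - map->node_count + 1;
}

/* Guarded model: skip the fixup when no GPU nodes were enumerated. */
static int fixup_guarded(int node_id, enum bank_kind kind, uint8_t hw_nid,
			 const struct gpu_map *map)
{
	assert(map != NULL);

	if (map->node_count == 0)
		return node_id;

	return fixup_unguarded(node_id, kind, hw_nid, map);
}
```

With an empty map ({0, 0}) and a UMC_V2 bank reporting hardware node 1,
the unguarded model returns 2, while the guarded one keeps the caller's
node ID of 1.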

Thanks,
Phineas Su