Re: [PATCH v2] EDAC/amd64: Only translate Node ID if GPU nodes are present

From: Yazen Ghannam

Date: Mon Jun 22 2026 - 13:08:01 EST


On Sun, Jun 14, 2026 at 03:06:18PM +0000, Phineas Su wrote:
> The fixup_node_id() function adjusts the AMD Node ID for GPU memory
> controllers (which report as SMCA_UMC_V2) on heterogeneous systems.
> CPU memory controllers typically report as SMCA_UMC (v1) and do not
> require translation.
>
> If a CPU memory controller on a CPU-only system were to report as
> SMCA_UMC_V2 (e.g. due to future hardware iterations, firmware reporting,
> or kernel enumeration differences), the code would attempt to apply the
> GPU translation. Since gpu_node_map is uninitialized on CPU-only systems,
> this would lead to an unintended translation being applied.
>
> Harden the translation logic by checking that GPU nodes are actually
> present in the system (gpu_node_map.node_count > 0) before applying the
> translation. This ensures CPU-only systems are always safely bypassed.
>
> Signed-off-by: Phineas Su <pohaosu@xxxxxxxxxx>
> ---
> v2:
> - Reframed commit message to focus on robustness rather than a confirmed
> bug, as Zen4 CPUs do not generally report UMC_V2. (Yazen)
>
> drivers/edac/amd64_edac.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index c6aa69dbd9fb..1e688123a50c 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -1047,6 +1047,10 @@ static int fixup_node_id(int node_id, struct mce *m)
> if (smca_get_bank_type(m->extcpu, m->bank) != SMCA_UMC_V2)
> return node_id;
>
> + /* If no GPU nodes are present, no fixup is needed. */
> + if (!gpu_node_map.node_count)
> + return node_id;
> +

Thanks Phineas for the patch.

I do agree, in principle, about hardening against possible new
configurations.

I think your patch could go a bit further.

I don't see any users of gpu_node_map.node_count. So rather than add a
user, we could remove it.

In fact, we don't even need the entire struct. We just need the
'gpu_base_node_id'.

> /* Nodes below the GPU base node are CPU nodes and don't need a fixup. */
> if (nid < gpu_node_map.base_node_id)
> return node_id;

You could add the "uninitialized" check here. ^^^

	if (!gpu_base_node_id || nid < gpu_base_node_id)
		return node_id;
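
To make the idea concrete, here is a rough standalone sketch of the combined check, not the actual driver code. It assumes the struct is replaced by a single static 'gpu_base_node_id' that stays zero when no GPU nodes are found. The function name, the 'is_umc_v2' flag (standing in for the smca_get_bank_type() test) and translate_gpu_node() are hypothetical placeholders; the real translation is not shown in this thread.

```c
/*
 * Assumption: a single static replaces gpu_node_map. Statics are
 * zero-initialized, so 0 doubles as "no GPU nodes were found".
 */
static unsigned int gpu_base_node_id;

/* Hypothetical placeholder for the real GPU Node ID translation. */
static int translate_gpu_node(int node_id)
{
	return node_id + 100;
}

/*
 * Sketch of the fixup flow. 'is_umc_v2' stands in for the
 * smca_get_bank_type() comparison, 'nid' for the hardware Node ID.
 */
static int fixup_node_id_sketch(int node_id, unsigned int nid, int is_umc_v2)
{
	/* Only UMC_V2 banks are candidates for a fixup. */
	if (!is_umc_v2)
		return node_id;

	/*
	 * No GPU base node recorded (CPU-only system), or the node sits
	 * below the GPU base node: it is a CPU node, so no fixup.
	 */
	if (!gpu_base_node_id || nid < gpu_base_node_id)
		return node_id;

	return translate_gpu_node(node_id);
}
```

With gpu_base_node_id left at zero, every call returns node_id unchanged, which is the CPU-only bypass the patch is after, without needing node_count.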

What do you think?

Thanks,
Yazen