Re: [PATCH] EDAC/mce_amd: Fix Hygon UMC ECC error decoding with logical_die_id

From: Yazen Ghannam

Date: Mon Feb 16 2026 - 15:32:28 EST


On Sat, Feb 14, 2026 at 02:42:03PM +0800, Aichun Shi wrote:
> cpuinfo_topology.amd_node_id is populated via CPUID or MSR, as introduced
> by commit f7fb3b2dd92c ("x86/cpu: Provide an AMD/HYGON specific topology
> parser") and commit 03fa6bea5a3e ("x86/cpu: Make topology_amd_node_id()
> use the actual node info"). However, this value may be non-continuous for
> Hygon processors while EDAC uses continuous node IDs, which leads to
> incorrect UMC ECC error decoding.

Can you please share an example?
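
To make the question concrete, this is the kind of mismatch I understand the description to mean. It is only a userspace sketch: the IDs are made up (not taken from real hardware), and find_mc(), struct mc_instance, hw_node_id and registered are hypothetical stand-ins, not the EDAC API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered memory controller instance. */
struct mc_instance {
	int mc_idx;	/* index the instance was registered under */
};

/*
 * Look up an instance by index. Returns NULL when no instance was
 * registered under that index.
 */
static const struct mc_instance *find_mc(const struct mc_instance *tbl,
					 size_t n, int idx)
{
	assert(tbl != NULL);

	for (size_t i = 0; i < n; i++)
		if (tbl[i].mc_idx == idx)
			return &tbl[i];

	return NULL;
}

/* Made-up values: two nodes whose hardware node IDs are 0 and 4. */
static const int hw_node_id[] = { 0, 4 };

/* Instances registered under contiguous indices 0 and 1. */
static const struct mc_instance registered[] = { { 0 }, { 1 } };
```

With these assumed values, a lookup using the second node's hardware ID (4) finds nothing, while the contiguous index (1) does. Real IDs from an affected system would confirm whether this is what happens.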

>
> In contrast, cpuinfo_topology.logical_die_id always provides continuous
> die (or node) IDs. Fix this by replacing topology_amd_node_id() with
> topology_logical_die_id() when decoding UMC ECC errors for Hygon
> processors.
>
> Signed-off-by: Aichun Shi <shiaichun@xxxxxxxxxxxxxx>
> ---
> drivers/edac/mce_amd.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
> index af3c12284a1e..4a23c1d6488e 100644
> --- a/drivers/edac/mce_amd.c
> +++ b/drivers/edac/mce_amd.c
> @@ -746,8 +746,13 @@ static void decode_smca_error(struct mce *m)
>  	pr_emerg(HW_ERR "%s Ext. Error Code: %d", smca_get_long_name(bank_type), xec);
>  
>  	if ((bank_type == SMCA_UMC || bank_type == SMCA_UMC_V2) &&
> -	    xec == 0 && decode_dram_ecc)
> -		decode_dram_ecc(topology_amd_node_id(m->extcpu), m);
> +	    xec == 0 && decode_dram_ecc) {
> +		if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON &&
> +		    boot_cpu_data.x86 == 0x18)

Is the family check necessary? You did not mention a specific family in
the commit message. So it seems the intent is to apply to all Hygon
systems.

Thanks,
Yazen