RE: [PATCH 4/7] EDAC/mce_amd: extract node id from InstanceHi in IPID
From: Chatradhi, Naveen Krishna
Date: Tue Aug 10 2021 - 08:45:30 EST
[Public]
Hi Yazen
-----Original Message-----
From: Ghannam, Yazen <Yazen.Ghannam@xxxxxxx>
Sent: Thursday, July 29, 2021 10:03 PM
To: Chatradhi, Naveen Krishna <NaveenKrishna.Chatradhi@xxxxxxx>
Cc: linux-edac@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; bp@xxxxxxxxx; mingo@xxxxxxxxxx; mchehab@xxxxxxxxxx
Subject: Re: [PATCH 4/7] EDAC/mce_amd: extract node id from InstanceHi in IPID
On Wed, Jun 30, 2021 at 08:58:25PM +0530, Naveen Krishna Chatradhi wrote:
> On AMD systems with SMCA banks on NONCPU nodes, the node id
> information is available in the InstanceHI[47:44] of the IPID register.
The doesn't read well to me. I saw this as saying "bits 47:44 of the InstanceHi register". Also, the name of the field is "InstanceIdHi" in the documentation.
I think it'd be more clear to say "available in MCA_IPID[47:44] (InstanceIdHi)" or something similar.
[naveenk:] Modified the commit message
>
> Signed-off-by: Muralidhara M K <muralimk@xxxxxxx>
> Signed-off-by: Naveen Krishna Chatradhi <nchatrad@xxxxxxx>
> ---
> drivers/edac/mce_amd.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c index
> 27d56920b469..364dfb6e359d 100644
> --- a/drivers/edac/mce_amd.c
> +++ b/drivers/edac/mce_amd.c
> @@ -1049,6 +1049,7 @@ static void decode_smca_error(struct mce *m)
> enum smca_bank_types bank_type;
> const char *ip_name;
> u8 xec = XEC(m->status, xec_mask);
> + u32 node_id = 0;
Why u32? Why not u16 to match topology_die_id() or int to match decode_dram_ecc()?
[naveenk:] Done, used int.
>
> if (m->bank >= ARRAY_SIZE(smca_banks))
> return;
> @@ -1072,8 +1073,18 @@ static void decode_smca_error(struct mce *m)
> if (xec < smca_mce_descs[bank_type].num_descs)
> pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
>
> - if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
> - decode_dram_ecc(topology_die_id(m->extcpu), m);
> + /*
> + * SMCA_UMC_V2 is used on the noncpu nodes, extract the node id
> + * from the InstanceHI[47:44] of the IPID register.
> + */
> + if (bank_type == SMCA_UMC_V2 && xec == 0)
> + node_id = ((m->ipid >> 44) & 0xF);
> +
> + if (bank_type == SMCA_UMC && xec == 0)
> + node_id = topology_die_id(m->extcpu);
> +
> + if (decode_dram_ecc)
> + decode_dram_ecc(node_id, m);
If decode_dram_ecc() is set, then this will call it on every MCA error that comes in. Rather we only want to call it on DRAM ECC errors.
You could do something like this:
if (decode_dram_ecc && xec == 0) {
u32 node_id = 0;
if (bank_type == SMCA_UMC)
node_id = XXX;
else if (bank_type == SMCA_UMC_V2)
node_id = YYY;
else
return;
decode_dram_ecc(node_id, m);
}
This is just an example. Maybe you can save an indentation level by negating those conditions and returning early, etc.
[naveenk:] modified the ladder.
Thanks,
Yazen
[naveenk:] Thank you.