[PATCH 5/6] EDAC, mce_amd: Print syndrome register value on SMCA systems

From: Borislav Petkov
Date: Fri Jul 08 2016 - 05:10:23 EST


From: Yazen Ghannam <Yazen.Ghannam@xxxxxxx>

Print SyndV bit status and print the raw value of the MCA_SYND register.
Further decoding of the syndrome from struct mce.synd can be done in
other places where appropriate, e.g. DRAM ECC.

Boris: make the error stanza more compact by putting the error address
and syndrome on the same line:

[Hardware Error]: Corrected error, no action required.
[Hardware Error]: CPU:2 (17:0:0) MC4_STATUS[-|CE|-|PCC|AddrV|-|-|SyndV|CECC]: 0x96204100001e0117
[Hardware Error]: Error Addr: 0x000000007f4c52e3, Syndrome: 0x0000000000000000
[Hardware Error]: Invalid IP block specified.
[Hardware Error]: cache level: L3/GEN, tx: DATA, mem-tx: RD

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@xxxxxxx>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@xxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: linux-edac <linux-edac@xxxxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1467633035-32080-2-git-send-email-Yazen.Ghannam@xxxxxxx
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
---
drivers/edac/mce_amd.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 9b6800a79c7f..057ece577800 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -927,7 +927,7 @@ static void decode_smca_errors(struct mce *m)
size_t len;

if (rdmsr_safe(addr, &low, &high)) {
- pr_emerg("Invalid IP block specified, error information is unreliable.\n");
+ pr_emerg(HW_ERR "Invalid IP block specified.\n");
return;
}

@@ -1078,6 +1078,8 @@ int amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
u32 low, high;
u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);

+ pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
+
if (!rdmsr_safe(addr, &low, &high) &&
(low & MCI_CONFIG_MCAX))
pr_cont("|%s", ((m->status & MCI_STATUS_TCC) ? "TCC" : "-"));
@@ -1091,12 +1093,18 @@ int amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
pr_cont("]: 0x%016llx\n", m->status);

if (m->status & MCI_STATUS_ADDRV)
- pr_emerg(HW_ERR "MC%d Error Address: 0x%016llx\n", m->bank, m->addr);
+ pr_emerg(HW_ERR "Error Addr: 0x%016llx", m->addr);

if (boot_cpu_has(X86_FEATURE_SMCA)) {
+ if (m->status & MCI_STATUS_SYNDV)
+ pr_cont(", Syndrome: 0x%016llx", m->synd);
+
+ pr_cont("\n");
+
decode_smca_errors(m);
goto err_code;
- }
+ } else
+ pr_cont("\n");

if (!fam_ops)
goto err_code;
--
2.7.3