Re: [PATCH v3] EDAC/ghes: Setup DIMM label from DMI and use it in error reports

From: Borislav Petkov
Date: Tue May 19 2020 - 16:25:44 EST


On Mon, May 18, 2020 at 11:58:52AM +0200, Robert Richter wrote:
> +static void dimm_setup_label(struct dimm_info *dimm, u16 handle)
> +{
> +	const char *bank = NULL, *device = NULL;
> +
> +	dmi_memdev_name(handle, &bank, &device);
> +
> +	/* both strings must be non-zero */
> +	if (bank && *bank && device && *device)
> +		snprintf(dimm->label, sizeof(dimm->label),
> +			"%s %s", bank, device);
> +	else
> +		snprintf(dimm->label, sizeof(dimm->label),
> +			"unknown memory (handle: 0x%.4x)", handle);

This changes the sysfs strings on my test box as shown in the diff
below. 00-ghes.before and 01-ghes.after were created by doing:

grep -EriIn . /sys/devices/system/edac/ 2>/dev/null > [filename]

edac_mc_alloc_dimms() already sets dimm->label to "mc#%dmemory#%d",
but I'm guessing that dmi_memdev_name() doesn't return on my machine
what it returns on yours, so the fallback branch overwrites that default.
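
For illustration only, here is a minimal userspace sketch of the label
selection in the quoted hunk. It is not the kernel code: struct fake_dimm,
fake_default_label(), fake_setup_label() and LABEL_LEN are hypothetical
stand-ins, and the bank/device strings are passed in directly instead of
coming from dmi_memdev_name(). It only shows how empty DMI strings would
turn the default label into the "unknown memory" one seen in the diff.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer size for this sketch, not taken from the kernel headers. */
#define LABEL_LEN 32

/* Hypothetical stand-in for struct dimm_info; only the label matters here. */
struct fake_dimm {
	char label[LABEL_LEN];
};

/* Mimics the default label format mentioned above: "mc#%dmemory#%d". */
static void fake_default_label(struct fake_dimm *dimm, int mc, int idx)
{
	snprintf(dimm->label, sizeof(dimm->label), "mc#%dmemory#%d", mc, idx);
}

/*
 * Same branch structure as the quoted dimm_setup_label(), but with the
 * bank and device strings supplied by the caller (assumption: this is
 * what dmi_memdev_name() would have filled in).
 */
static void fake_setup_label(struct fake_dimm *dimm, unsigned short handle,
			     const char *bank, const char *device)
{
	/* Use the DMI strings only if both are present and non-empty. */
	if (bank && *bank && device && *device)
		snprintf(dimm->label, sizeof(dimm->label),
			 "%s %s", bank, device);
	else
		/* Otherwise the previous label is overwritten unconditionally. */
		snprintf(dimm->label, sizeof(dimm->label),
			 "unknown memory (handle: 0x%.4x)", (unsigned int)handle);
}
```

Calling fake_default_label(&d, 0, 15) and then fake_setup_label(&d, 0x0030,
"", "") leaves d.label holding the fallback string rather than the default
one, which matches the dimm15 change in the diff below.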

Welcome to the wonderful world of consistently implemented firmware!

--- 00-ghes.before 2020-05-19 17:55:50.821220239 +0200
+++ 01-ghes.after 2020-05-19 22:09:28.808492701 +0200
@@ -17,7 +17,7 @@
/sys/devices/system/edac/mc/mc0/ce_count:1:0
/sys/devices/system/edac/mc/mc0/mc_name:1:ghes_edac
/sys/devices/system/edac/mc/mc0/csrow15/ce_count:1:0
-/sys/devices/system/edac/mc/mc0/csrow15/ch0_dimm_label:1:mc#0memory#15
+/sys/devices/system/edac/mc/mc0/csrow15/ch0_dimm_label:1:unknown memory (handle: 0x0030)
/sys/devices/system/edac/mc/mc0/csrow15/power/runtime_active_time:1:0
/sys/devices/system/edac/mc/mc0/csrow15/power/runtime_active_kids:1:0
/sys/devices/system/edac/mc/mc0/csrow15/power/runtime_usage:1:0
@@ -42,7 +42,7 @@
/sys/devices/system/edac/mc/mc0/power/runtime_enabled:1:disabled & forbidden
/sys/devices/system/edac/mc/mc0/power/control:1:on
/sys/devices/system/edac/mc/mc0/csrow31/ce_count:1:0
-/sys/devices/system/edac/mc/mc0/csrow31/ch0_dimm_label:1:mc#0memory#31
+/sys/devices/system/edac/mc/mc0/csrow31/ch0_dimm_label:1:unknown memory (handle: 0x0040)
/sys/devices/system/edac/mc/mc0/csrow31/power/runtime_active_time:1:0
/sys/devices/system/edac/mc/mc0/csrow31/power/runtime_active_kids:1:0
/sys/devices/system/edac/mc/mc0/csrow31/power/runtime_usage:1:0
@@ -73,10 +73,10 @@
/sys/devices/system/edac/mc/mc0/dimm15/dimm_dev_type:1:Unknown
/sys/devices/system/edac/mc/mc0/dimm15/size:1:32768
/sys/devices/system/edac/mc/mc0/dimm15/dimm_ce_count:1:0
-/sys/devices/system/edac/mc/mc0/dimm15/dimm_label:1:mc#0memory#15
+/sys/devices/system/edac/mc/mc0/dimm15/dimm_label:1:unknown memory (handle: 0x0030)
/sys/devices/system/edac/mc/mc0/dimm15/dimm_location:1:memory 15
/sys/devices/system/edac/mc/mc0/dimm15/dimm_edac_mode:1:SECDED
-/sys/devices/system/edac/mc/mc0/seconds_since_reset:1:354
+/sys/devices/system/edac/mc/mc0/seconds_since_reset:1:979
/sys/devices/system/edac/mc/mc0/dimm31/dimm_ue_count:1:0
/sys/devices/system/edac/mc/mc0/dimm31/dimm_mem_type:1:Registered-DDR4
/sys/devices/system/edac/mc/mc0/dimm31/power/runtime_active_time:1:0
@@ -90,7 +90,7 @@
/sys/devices/system/edac/mc/mc0/dimm31/dimm_dev_type:1:Unknown
/sys/devices/system/edac/mc/mc0/dimm31/size:1:32768
/sys/devices/system/edac/mc/mc0/dimm31/dimm_ce_count:1:0
-/sys/devices/system/edac/mc/mc0/dimm31/dimm_label:1:mc#0memory#31
+/sys/devices/system/edac/mc/mc0/dimm31/dimm_label:1:unknown memory (handle: 0x0040)
/sys/devices/system/edac/mc/mc0/dimm31/dimm_location:1:memory 31
/sys/devices/system/edac/mc/mc0/dimm31/dimm_edac_mode:1:SECDED
/sys/devices/system/edac/mc/mc0/max_location:1:memory 31

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette