Re: [PATCH 0/7] EDAC, mce_amd: Issue decoded MCE through the tracepoint

From: Borislav Petkov
Date: Mon Aug 28 2017 - 09:46:10 EST


On Fri, Aug 25, 2017 at 12:24:04PM +0200, Borislav Petkov wrote:
> Next step is adding that to rasdaemon.

Ok, below is the dirty version of the changes that need to go into
rasdaemon; I'll clean that up later. With it, I get:

# ./rasdaemon -f
overriding event (541) ras:mc_event with new print handler
rasdaemon: ras:mc_event event enabled
rasdaemon: Enabled event ras:mc_event
overriding event (58) mce:mce_record with new print handler
rasdaemon: mce:mce_record event enabled
rasdaemon: Enabled event mce:mce_record
overriding event (542) ras:extlog_mem_event with new print handler
rasdaemon: ras:extlog_mem_event event enabled
rasdaemon: Enabled event ras:extlog_mem_event
rasdaemon: Listening to events for cpus 0 to 7
cpu 07: <...>-104 [1433913776] 0.000006: mce_record: 2017-08-28 17:41:01 +0200 bank=4, status= 9c7d410092080813, MC4 Error (node 2): DRAM ECC error detected on the NB.
, cpu_type= generic CPU, cpu= 2, socketid= 0, misc= 0, addr= 6d3d483b, , apicid= 0

and looking at it now, I don't need that "MC%d Error..:" thing either.

All queued for the next version.
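
One thing to watch in the cleanup: strncpy() does not NUL-terminate the
destination when the source fills the whole count, so the copy of the raw
"decoded_str" field in the diff below is only terminated as long as the
field is shorter than the buffer. A minimal sketch of a bounded copy that
always terminates follows. copy_raw_field() is a hypothetical helper for
illustration, not an existing rasdaemon or libtrace function, and the
destination size is assumed to be known to the caller:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of rasdaemon: copy at most src_len bytes
 * of a raw (possibly unterminated) field into dst and always
 * NUL-terminate the result.
 *
 * Returns the number of bytes copied, excluding the terminator.
 */
static size_t copy_raw_field(char *dst, size_t dst_size,
			     const char *src, size_t src_len)
{
	size_t n;

	/* Nothing can be stored, not even the terminator. */
	if (!dst || !dst_size)
		return 0;

	if (!src)
		src_len = 0;

	/* Truncate so that one byte is left for the terminator. */
	n = src_len < dst_size - 1 ? src_len : dst_size - 1;

	if (n)
		memcpy(dst, src, n);
	dst[n] = '\0';

	return n;
}
```

With something like this, the call site would pass sizeof(e.error_msg)
instead of a hard-coded length, assuming error_msg is a fixed-size array
in struct mce_event.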

---
diff --git a/ras-mce-handler.c b/ras-mce-handler.c
index 2e520d3663ac..ff6f4b373e56 100644
--- a/ras-mce-handler.c
+++ b/ras-mce-handler.c
@@ -23,6 +23,7 @@
 #include <unistd.h>
 #include <stdint.h>
 #include "libtrace/kbuffer.h"
+#include "libtrace/event-utils.h"
 #include "ras-mce-handler.h"
 #include "ras-record.h"
 #include "ras-logger.h"
@@ -185,6 +186,10 @@ static int detect_cpu(struct ras_events *ras)
 	ret = 0;

 	if (!strcmp(mce->vendor, "AuthenticAMD")) {
+
+		ret = 0;
+		goto ret;
+
 		if (mce->family == 15)
 			mce->cputype = CPU_K8;
 		if (mce->family > 15) {
@@ -357,8 +362,9 @@ int ras_mce_event_handler(struct trace_seq *s,
 	unsigned long long val;
 	struct ras_events *ras = context;
 	struct mce_priv *mce = ras->mce_priv;
+	const char *decoded_mce;
 	struct mce_event e;
-	int rc = 0;
+	int rc = 0, len;

 	memset(&e, 0, sizeof(e));

@@ -422,6 +428,10 @@ int ras_mce_event_handler(struct trace_seq *s,
 	if (rc)
 		return rc;

+	decoded_mce = pevent_get_field_raw(s, event, "decoded_str", record, &len, 1);
+	if (decoded_mce)
+		strncpy(e.error_msg, decoded_mce, min(len, 4096));
+
 	if (!*e.error_msg && *e.mcastatus_msg)
 		mce_snprintf(e.error_msg, "%s", e.mcastatus_msg);

--
Regards/Gruss,
Boris.