Re: [RFC PATCH 7/8] EDAC, mce_amd: Add a simple tracepoint dumping a decoded string

From: Steven Rostedt
Date: Thu Jul 27 2017 - 21:48:09 EST


On Tue, 25 Jul 2017 17:46:00 +0200
Borislav Petkov <bp@xxxxxxxxx> wrote:

> From: Borislav Petkov <bp@xxxxxxx>
>
> It is a single string which gets dynamically generated when the error
> gets decoded. Dump it to userspace through that tracepoint so that
> consumers can get the already decoded string and the kernel has the
> decoding functionality too, even if there are no userspace consumers.
>
> Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>

Acked-by: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>

-- Steve

> ---
> drivers/edac/mce_amd.c  |  7 ++++++-
> drivers/ras/ras.c       |  1 +
> include/ras/ras_event.h | 16 ++++++++++++++++
> 3 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
> index 1f5e9bb161f3..ce7e20ca6773 100644
> --- a/drivers/edac/mce_amd.c
> +++ b/drivers/edac/mce_amd.c
> @@ -1,6 +1,8 @@
> #include <linux/seq_buf.h>
> #include <linux/module.h>
> #include <linux/slab.h>
> +#include <linux/ras.h>
> +#include <ras/ras_event.h>
>
> #include <asm/cpu.h>
>
> @@ -1053,7 +1055,10 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
> err_code:
> 	amd_decode_err_code(m->status & 0xffff);
>
> -	pr_emerg("%.*s\n", (int)sb.len, sb.buffer);
> +	if (ras_userspace_consumers())
> +		trace_mce_decode(sb.buffer);
> +	else
> +		pr_emerg("%.*s\n", (int)sb.len, sb.buffer);
>
> 	seq_buf_clear_buf(&sb);
>
> diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
> index 5429d3795732..a09e39b3f711 100644
> --- a/drivers/ras/ras.c
> +++ b/drivers/ras/ras.c
> @@ -42,6 +42,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(extlog_mem_event);
> EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
> EXPORT_TRACEPOINT_SYMBOL_GPL(non_standard_event);
> EXPORT_TRACEPOINT_SYMBOL_GPL(arm_event);
> +EXPORT_TRACEPOINT_SYMBOL_GPL(mce_decode);
>
> static int __init parse_ras_param(char *str)
> {
> diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
> index 429f46fb61e4..113df73f7ba0 100644
> --- a/include/ras/ras_event.h
> +++ b/include/ras/ras_event.h
> @@ -407,6 +407,22 @@ TRACE_EVENT(memory_failure_event,
> 	)
> );
> #endif /* CONFIG_MEMORY_FAILURE */
> +
> +TRACE_EVENT(mce_decode,
> +	TP_PROTO(const char *param_str),
> +
> +	TP_ARGS(param_str),
> +
> +	TP_STRUCT__entry(
> +		__string(str, param_str)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(str, param_str);
> +	),
> +
> +	TP_printk("%s", __get_str(str))
> +);
> #endif /* _TRACE_HW_EVENT_MC_H */
>
> /* This part must be outside protection */
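
For readers following along outside a kernel tree, the dispatch in the amd_decode_mce() hunk can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: struct decoded_buf, consumer_fn, record_consumer() and emit_decoded() are hypothetical names standing in for struct seq_buf, the tracepoint, and the ras_userspace_consumers() check. One difference between the two paths shows up clearly here: the log path prints a length-bounded string with "%.*s", while the consumer path receives a bare C string. Whether the kernel buffer is already NUL-terminated at that point is not stated in this thread, so the sketch terminates it explicitly as an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct seq_buf: bytes plus a length. */
struct decoded_buf {
	char buffer[128];
	size_t len;
};

/* Hypothetical consumer hook; NULL means nobody is listening. */
typedef void (*consumer_fn)(const char *str);

static char last_traced[128];

/* Example consumer that just remembers the last string it was handed. */
static void record_consumer(const char *str)
{
	snprintf(last_traced, sizeof(last_traced), "%s", str);
}

/*
 * Hand the decoded string to the consumer if there is one, otherwise
 * print it to the log. Returns 1 if the consumer got it, 0 if it was logged.
 */
static int emit_decoded(struct decoded_buf *db, consumer_fn consumer, FILE *log)
{
	if (consumer) {
		/* Assumption: the consumer wants a C string, so terminate it. */
		if (db->len >= sizeof(db->buffer))
			db->len = sizeof(db->buffer) - 1;
		db->buffer[db->len] = '\0';
		consumer(db->buffer);
		return 1;
	}

	/* The log path is bounded by len and needs no terminator. */
	fprintf(log, "%.*s\n", (int)db->len, db->buffer);
	return 0;
}
```

Passing record_consumer as the second argument models the case where a userspace consumer is present; passing NULL models the fallback to the log.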