[PATCH 0/6] x86/RAS: Correctable Errors Collector
From: Borislav Petkov
Date: Mon Mar 27 2017 - 05:34:17 EST
From: Borislav Petkov <bp@xxxxxxx>
Hi guys,
here's v1. All feedback I know of has been addressed, so I guess it is
time. :)
It is not default y yet, but we will make it so after it has seen
wider testing. The end goal is to have it running by default so that
transient correctable ECC errors don't generate error logs and upset
people unnecessarily.
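
For readers who want a feel for the idea before opening cec.c, here is a
minimal userspace sketch of absorbing transient corrected errors. It
assumes a plain per-page counter with a fixed threshold; the table size,
the threshold value and all names (ce_table, ce_add, ...) are
hypothetical and not taken from the patches.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_TRACKED 8	/* hypothetical table size */
#define THRESHOLD   3	/* hypothetical count that triggers an action */

/* Hypothetical tracking table: one slot per page frame number seen. */
struct ce_table {
	uint64_t pfn[MAX_TRACKED];
	unsigned int count[MAX_TRACKED];
	unsigned int used;
};

enum ce_verdict {
	CE_ABSORBED,	/* counted silently, nothing reported */
	CE_ACT,		/* threshold reached, caller should act on the page */
	CE_TABLE_FULL,	/* no free slot, caller falls back to normal logging */
};

/* Record one corrected error for @pfn and say what the caller should do. */
static enum ce_verdict ce_add(struct ce_table *t, uint64_t pfn)
{
	unsigned int i;

	for (i = 0; i < t->used; i++) {
		if (t->pfn[i] != pfn)
			continue;

		if (++t->count[i] < THRESHOLD)
			return CE_ABSORBED;

		/* Threshold reached: drop the slot by moving the last one in. */
		t->used--;
		t->pfn[i] = t->pfn[t->used];
		t->count[i] = t->count[t->used];
		return CE_ACT;
	}

	if (t->used == MAX_TRACKED)
		return CE_TABLE_FULL;

	/* First error seen on this page: start counting. */
	t->pfn[t->used] = pfn;
	t->count[t->used] = 1;
	t->used++;
	return CE_ABSORBED;
}
```

In this sketch a caller would only log or act when ce_add() returns
something other than CE_ABSORBED, which is what keeps isolated errors
out of the logs.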
The other good thing resulting from this patchset is that we have *all*
MCE consumers lined up in a notifier with priorities. This way we have a
single chain which gets to see error records and not some wild variety
of hooks here and there.
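
To illustrate what "a single chain with priorities" means, here is a
minimal userspace sketch, not the kernel's notifier API. The struct and
function names (err_record, consumer, chain_register, chain_call) and
the two example consumers are hypothetical; the assumption is that a
higher priority runs first and that a consumer can stop the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an error record; not the kernel's struct mce. */
struct err_record {
	unsigned long long addr;
	int corrected;
};

/* Return values, loosely modeled on "keep going" vs. "stop the chain". */
enum { CHAIN_CONTINUE = 0, CHAIN_STOP = 1 };

struct consumer {
	int (*handle)(struct consumer *self, struct err_record *rec);
	int priority;			/* higher priority runs first */
	struct consumer *next;
};

/* Insert @c so the list stays sorted by descending priority. */
static void chain_register(struct consumer **head, struct consumer *c)
{
	while (*head && (*head)->priority >= c->priority)
		head = &(*head)->next;
	c->next = *head;
	*head = c;
}

/* Walk the chain in order; return how many consumers saw the record. */
static int chain_call(struct consumer *head, struct err_record *rec)
{
	struct consumer *c;
	int seen = 0;

	for (c = head; c; c = c->next) {
		seen++;
		if (c->handle(c, rec) == CHAIN_STOP)
			break;
	}
	return seen;
}

/* Example consumer: swallows corrected errors, passes on the rest. */
static int collector_handle(struct consumer *self, struct err_record *rec)
{
	(void)self;
	return rec->corrected ? CHAIN_STOP : CHAIN_CONTINUE;
}

/* Example consumer: counts every record that reaches it. */
static int logger_calls;

static int logger_handle(struct consumer *self, struct err_record *rec)
{
	(void)self;
	(void)rec;
	logger_calls++;
	return CHAIN_CONTINUE;
}
```

Registering the collector with a higher priority than the logger means
a corrected error stops at the collector, while an uncorrected one still
reaches the logger, regardless of registration order.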
Last but not least, /dev/mcelog has been deprecated and all the code is
behind a CONFIG_X86_MCELOG config option.
Btw, patch 1 is for urgent.
Please apply,
thanks.
Changelog:
=========
v0:
here's the latest incarnation of the CEC. I think I've taken care of
all review comments but feel free to correct me here. The introductory
comment in cec.c should explain the whole deal. I'm pointing there so
that we have that text in the actual source and don't spread it around
commit messages. So please have a look there for more info.
The thing now has knobs in debugfs which control its operation. I
hope I've chosen sane default values.
Andi Kleen (1):
  x86/mce: Don't print MCEs when mcelog is active

Borislav Petkov (4):
  x86/MCE: Rename mce_log()'s argument
  x86/MCE: Rename mce_log to mce_log_buffer
  RAS: Add a Corrected Errors Collector
  x86/mce: Do not register notifiers with invalid prio

Tony Luck (1):
  x86/mce: Deprecate /dev/mcelog

 Documentation/admin-guide/kernel-parameters.txt |   6 +
 arch/x86/Kconfig                                |  10 +-
 arch/x86/include/asm/mce.h                      |  12 +-
 arch/x86/kernel/cpu/mcheck/Makefile             |   2 +
 arch/x86/kernel/cpu/mcheck/dev-mcelog.c         | 397 ++++++++++++++++++
 arch/x86/kernel/cpu/mcheck/mce-internal.h       |   8 +
 arch/x86/kernel/cpu/mcheck/mce.c                | 501 +++++-----------------
 arch/x86/ras/Kconfig                            |  14 +
 drivers/ras/Makefile                            |   3 +-
 drivers/ras/cec.c                               | 532 ++++++++++++++++++++++++
 drivers/ras/debugfs.c                           |   2 +-
 drivers/ras/debugfs.h                           |   8 +
 drivers/ras/ras.c                               |  11 +
 include/linux/ras.h                             |  13 +-
 14 files changed, 1105 insertions(+), 414 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/mcheck/dev-mcelog.c
 create mode 100644 drivers/ras/cec.c
 create mode 100644 drivers/ras/debugfs.h
--
2.11.0