[RFC 0/5] riscv: initial support for Generic Hardware Error Source (GHES)
From: Rui Qi
Date: Thu Feb 06 2025 - 08:19:48 EST
From: Rui Qi <qirui.001@xxxxxxxxxxxxx>
NOTE: Before compiling the kernel, enable ACPI, APEI and GHES in menuconfig.
The following options must be enabled in the .config file:
CONFIG_HAVE_ACPI_APEI=y
CONFIG_ACPI_APEI=y
CONFIG_ACPI_APEI_GHES=y
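
As a quick sanity check before building, the three options above can be verified against the text of a .config file. The following is only a minimal sketch and not part of this series; the helper names config_has_y() and ghes_config_ok() are hypothetical, and the caller is assumed to have already read the .config contents into a string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from this series): return true if config_text,
 * the contents of a .config file, has a line that is exactly "<symbol>=y".
 */
bool config_has_y(const char *config_text, const char *symbol)
{
	size_t sym_len = strlen(symbol);
	const char *line = config_text;

	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		/* The whole line must be "<symbol>=y", so prefixes do not match. */
		if (len == sym_len + 2 &&
		    strncmp(line, symbol, sym_len) == 0 &&
		    line[sym_len] == '=' && line[sym_len + 1] == 'y')
			return true;

		line = eol ? eol + 1 : NULL;
	}
	return false;
}

/* Return true only if all three options from the note are set to y. */
bool ghes_config_ok(const char *config_text)
{
	return config_has_y(config_text, "CONFIG_HAVE_ACPI_APEI") &&
	       config_has_y(config_text, "CONFIG_ACPI_APEI") &&
	       config_has_y(config_text, "CONFIG_ACPI_APEI_GHES");
}
```

A line such as "# CONFIG_ACPI_APEI_GHES is not set" makes ghes_config_ok() return false, because only an exact "=y" line counts as enabled.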
Through fault injection, we can see the following example output:
[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[Hardware Error]: event severity: info
[Hardware Error]: Error 0, type: corrected
[Hardware Error]: section_type: memory error
[Hardware Error]: error_status: Storage error in DRAM memory (0x018d8480019304f0)
[Hardware Error]: node:0 card:0 module:0 rank:0 bank:0 device:0 row:0 column:0
[Hardware Error]: error_type: 2, single-bit ECC
[Hardware Error]: Error 1, type: corrected
[Hardware Error]: section_type: PCIe error
[Hardware Error]: port_type: 4, root port
[Hardware Error]: version: 3.0
[Hardware Error]: command: 0x0146, status: 0x0011
[Hardware Error]: device_id: 0000:00:00.0
[Hardware Error]: slot: 0
[Hardware Error]: secondary_bus: 0x01
[Hardware Error]: vendor_id: 0x1e93, device_id: 0x1010
[Hardware Error]: class_code: 060400
[Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0003
[Hardware Error]: aer_cor_status: 0x00001000, aer_cor_mask: 0x00000000
[Hardware Error]: aer_uncor_status: 0x00000000, aer_uncor_mask: 0x04000000
[Hardware Error]: aer_uncor_severity: 0x004e3030
[Hardware Error]: TLP Header: 00000000 00000000 00000000 00000000
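
To help read the raw values in the output above, here is a small user-space sketch of how two of the fields can be decoded. It is an illustration only, not code from this series: the function names are hypothetical, the error-type table is a subset based on my reading of the CPER error status encoding (error type in bits 8-15), and the AER table lists only some of the correctable error status bits defined by PCIe.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: map the error type field (bits 8-15) of a CPER
 * error_status value to a description. Only a subset of types is listed.
 */
const char *mem_err_status_str(uint64_t status)
{
	switch ((status >> 8) & 0xff) {
	case 1:  return "Error detected internal to the component";
	case 4:  return "Storage error in DRAM memory";
	case 5:  return "Storage error in TLB";
	case 6:  return "Storage error in cache";
	case 16: return "Error detected in the bus";
	default: return "unknown";
	}
}

struct aer_cor_bit {
	uint32_t mask;
	const char *name;
};

/* Subset of the PCIe AER Correctable Error Status register bits. */
static const struct aer_cor_bit aer_cor_bits[] = {
	{ 0x00000001, "Receiver Error" },
	{ 0x00000040, "Bad TLP" },
	{ 0x00000080, "Bad DLLP" },
	{ 0x00000100, "REPLAY_NUM Rollover" },
	{ 0x00001000, "Replay Timer Timeout" },
	{ 0x00002000, "Advisory Non-Fatal Error" },
};

/*
 * Hypothetical helper: return the name of the lowest known bit set in an
 * aer_cor_status value, or NULL if none of the listed bits is set.
 */
const char *aer_cor_first_name(uint32_t status)
{
	size_t i;

	for (i = 0; i < sizeof(aer_cor_bits) / sizeof(aer_cor_bits[0]); i++) {
		if (status & aer_cor_bits[i].mask)
			return aer_cor_bits[i].name;
	}
	return NULL;
}
```

With the values from the log, mem_err_status_str(0x018d8480019304f0) yields the same "Storage error in DRAM memory" string the kernel printed, and aer_cor_first_name(0x00001000) reports a Replay Timer Timeout.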
Rui Qi (5):
  riscv: select HAVE_ACPI_APEI
  efi: add riscv APEI generic processor error printing support
  riscv: add fix map index for GHES IRQ
  RISC-V: ACPI: define arch_apei_get_mem_attribute
  RISC-V: define ioremap_cache

 arch/riscv/Kconfig              |  1 +
 arch/riscv/include/asm/acpi.h   | 18 ++++++++++++++++++
 arch/riscv/include/asm/fixmap.h |  3 +++
 arch/riscv/include/asm/io.h     |  5 +++++
 drivers/firmware/efi/cper.c     |  4 ++++
 5 files changed, 31 insertions(+)
--
2.20.1