Re: [PATCH v2 1/2] apei/ghes: don't go past the ARM processor CPER record buffer
From: Shuai Xue
Date: Mon Dec 08 2025 - 21:43:06 EST
On 2025/11/28 18:53, Mauro Carvalho Chehab wrote:
There is logic inside ghes/cper to detect if the section_length
is too small, but it doesn't detect if it is too big.
Currently, if the firmware receives an ARM processor CPER record
stating that a section length is big, the kernel will blindly trust
section_length, producing a very long dump. For instance, a 67
byte record with ERR_INFO_NUM set to 46198 and section length
set to 854918320 would dump a lot of data, going way past the
firmware memory-mapped area.
Fix it by adding logic to prevent it from going past the buffer
if ERR_INFO_NUM is too big, making it report instead:
[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[Hardware Error]: event severity: recoverable
[Hardware Error]: Error 0, type: recoverable
[Hardware Error]: section_type: ARM processor error
[Hardware Error]: MIDR: 0xff304b2f8476870a
[Hardware Error]: section length: 854918320, CPER size: 67
[Hardware Error]: section length is too big
[Hardware Error]: firmware-generated error record is incorrect
[Hardware Error]: ERR_INFO_NUM is 46198
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx>
---
drivers/acpi/apei/ghes.c | 13 +++++++++++++
drivers/firmware/efi/cper-arm.c | 14 +++++++++-----
drivers/firmware/efi/cper.c | 3 ++-
include/linux/cper.h | 3 ++-
4 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 56107aa00274..8b90b6f3e866 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -557,6 +557,7 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
{
struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
int flags = sync ? MF_ACTION_REQUIRED : 0;
+ int length = gdata->error_data_length;
char error_type[120];
bool queued = false;
int sec_sev, i;
@@ -568,7 +569,12 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
return false;
p = (char *)(err + 1);
+ length -= sizeof(err);
+
for (i = 0; i < err->err_info_num; i++) {
+ if (length <= 0)
+ break;
+
Hi, Mauro,
The bounds checking logic is duplicated - it appears both in the cache
error handling branch and after it. This could be simplified. It would
be better to ensure we have enough data for the error info header in one
check.
	/* Ensure we have enough data for the error info header */
	if (length < sizeof(struct cper_arm_err_info))
		break;
It would also be better to validate the claimed length before using it.
	/* Validate the claimed length before using it */
	if (err_info->length < sizeof(struct cper_arm_err_info) ||
	    err_info->length > length)
		break;
Thanks.
Shuai