Re: [PATCH V3 2/2] acpi: apei: call into AER handling regardless of severity

From: Tyler Baicar
Date: Mon Nov 13 2017 - 10:35:07 EST


On 11/13/2017 7:36 AM, Dongdong Liu wrote:

On 2017/11/9 3:13, Tyler Baicar wrote:
Currently the GHES code only calls into the AER driver for
recoverable type errors. This is incorrect because errors of
other severities do not get logged by the AER driver and do not
get exposed to user space via the AER trace event. So, call
into the AER driver for PCIe errors regardless of the severity.

With this change, do_recovery() will also be called for AER correctable errors.
Correctable errors include those error conditions where hardware can recover without any loss of information.
Hardware corrects these errors and software intervention is not required.
So we'd better modify the code as below.
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 7448052..a7f77549 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -633,7 +633,8 @@ static void aer_recover_work_func(struct work_struct *work)
                         continue;
                 }
                 cper_print_aer(pdev, entry.severity, entry.regs);
-                do_recovery(pdev, entry.severity);
+                if(entry.severity != AER_CORRECTABLE)
+                        do_recovery(pdev, entry.severity);
                 pci_dev_put(pdev);
         }
 }
Hello Dongdong,

Yes, I have a patch for this that needs to be picked up.

https://lkml.org/lkml/2017/8/28/848

Thanks,
Tyler

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.