Re: [PATCH v6 2/2] acpi: apei: Do not panic() on PCIe errors reported through GHES

From: Tyler Baicar
Date: Mon May 21 2018 - 09:33:10 EST

In reply to: Alexandru Gagniuc: "[PATCH v6 2/2] acpi: apei: Do not panic() on PCIe errors reported through GHES"
Next in thread: Alex G.: "Re: [PATCH v6 2/2] acpi: apei: Do not panic() on PCIe errors reported through GHES"

On 5/21/2018 9:49 AM, Alexandru Gagniuc wrote:

+/* PCIe errors should not cause a panic. */
+static int ghes_sec_pcie_severity(struct acpi_hest_generic_data *gdata)
+{
+	struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
+
+	if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
+	    pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO &&
+	    IS_ENABLED(CONFIG_ACPI_APEI_PCIEAER))
+		return GHES_SEV_RECOVERABLE;
+
+	return ghes_cper_severity(gdata->error_severity);
+}
+
+/*
+ * The severity field in the status block is an unreliable metric for the
+ * severity. A more reliable way is to look at each subsection and see how safe
+ * it is to call the appropriate error handler.
+ * We're not concerned with handling the error. We're concerned with being able
+ * to notify an error handler by crossing the NMI/IRQ boundary, being able to
+ * schedule_work, and so forth.
+ * - SEC_PCIE: All PCIe errors can be handled by AER.
+ */
+static int ghes_severity(struct ghes *ghes)
+{
+	int worst_sev, sec_sev;
+	struct acpi_hest_generic_data *gdata;
+	const guid_t *section_type;
+	const struct acpi_hest_generic_status *estatus = ghes->estatus;
+
+	worst_sev = GHES_SEV_NO;
+	apei_estatus_for_each_section(estatus, gdata) {
+		section_type = (guid_t *)gdata->section_type;
+		sec_sev = ghes_cper_severity(gdata->error_severity);
+
+		if (guid_equal(section_type, &CPER_SEC_PCIE))
+			sec_sev = ghes_sec_pcie_severity(gdata);
+
+		worst_sev = max(worst_sev, sec_sev);
+	}
+
+	return worst_sev;
+}
+
 static void ghes_do_proc(struct ghes *ghes,
 			 const struct acpi_hest_generic_status *estatus)
 {
@@ -944,7 +986,7 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 			ret = NMI_HANDLED;
 		}
-		sev = ghes_cper_severity(ghes->estatus->error_severity);
+		sev = ghes_severity(ghes);

Hello Alex,

There is a compile warning if CONFIG_HAVE_ACPI_APEI_NMI is not selected.

  CC      drivers/acpi/apei/ghes.o
drivers/acpi/apei/ghes.c:483:12: warning: ‘ghes_severity’ defined but not used [-Wunused-function]
 static int ghes_severity(struct ghes *ghes)
            ^~~~~~~~~~~~~
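
For illustration only, here is a userspace sketch of how this kind of warning is usually avoided when a static helper's only caller is compiled conditionally. It is not a patch against ghes.c. HAVE_NMI_PATH is a hypothetical stand-in for CONFIG_HAVE_ACPI_APEI_NMI, demo_maybe_unused stands in for the kernel's __maybe_unused annotation, and every demo_* name is made up. It assumes the warning comes from ghes_notify_nmi() being the only caller of ghes_severity() in this patch.

```c
/* Userspace sketch; all demo_* names are hypothetical. */

/* Stand-in for the kernel's __maybe_unused (a GCC/Clang attribute). */
#define demo_maybe_unused __attribute__((unused))

enum demo_sev {
	DEMO_SEV_NO,
	DEMO_SEV_CORRECTED,
	DEMO_SEV_RECOVERABLE,
	DEMO_SEV_PANIC,
};

/*
 * Option 1: annotate the helper so -Wunused-function stays quiet when the
 * only caller below is compiled out. Returns the worst severity found.
 */
static demo_maybe_unused int demo_severity(const int *section_sev, int count)
{
	int worst = DEMO_SEV_NO;
	int i;

	for (i = 0; i < count; i++) {
		if (section_sev[i] > worst)
			worst = section_sev[i];
	}

	return worst;
}

#ifdef HAVE_NMI_PATH
/*
 * Option 2 (alternative): move demo_severity() inside this #ifdef so the
 * helper is only built together with its caller.
 */
int demo_notify_nmi(const int *section_sev, int count)
{
	/* Non-zero means the caller would have to panic. */
	return demo_severity(section_sev, count) >= DEMO_SEV_PANIC;
}
#endif
```

Building this with gcc -Wall and without -DHAVE_NMI_PATH should not trigger -Wunused-function, because of the attribute. Whether an annotation or moving the helper under the #ifdef fits ghes.c better depends on whether other callers are planned.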

Thanks,
Tyler

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
