Re: [RFC PATCH v4 3/3] acpi: apei: Do not panic() on PCIe errors reported through GHES

From: Borislav Petkov
Date: Fri May 11 2018 - 12:30:21 EST


On Fri, May 11, 2018 at 11:12:25AM -0500, Alex G. wrote:
> > I think *you* didn't get it: IS_ENABLED(CONFIG_ACPI_APEI_PCIEAER) is not
> > enough of a check to confirm that there actually *is* an AER driver to
> > handle the errors. If you really want to make sure the driver is loaded
> > and functioning, then you need an explicit registering mechanism or some
> > other way of checking it really is there and handling errors.
>
> config ACPI_APEI_PCIEAER
> bool "APEI PCIe AER logging/recovering support"
> depends on ACPI_APEI && PCIEAER
> help
> PCIe AER errors may be reported via APEI firmware first mode.
> Turn on this option to enable the corresponding support.
>
> PCIAER is not modularizable. QED

QED my ass.

Read the f*ck my email again: the presence of the *code* is
not enough of a check to confirm the error has been handled.
aer_recover_work_func() can fail as that kfifo_put() in
aer_recover_queue() can too.

You need an *actual* confirmation that the error has been handled
properly and *only* *then* not panic the system. Otherwise you are
potentially leaving those errors unhandled.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.