Re: [PATCH] acpi: apei: call into AER handling regardless of severity

From: Borislav Petkov
Date: Tue Aug 29 2017 - 18:19:56 EST


On Tue, Aug 29, 2017 at 03:27:42PM -0600, Baicar, Tyler wrote:
> To avoid calling the
> do_recovery() function for correctable errors I created
> https://patchwork.kernel.org/patch/9925877/

enum {
	GHES_SEV_NO = 0x0,
	GHES_SEV_CORRECTED = 0x1,
	GHES_SEV_RECOVERABLE = 0x2,
	GHES_SEV_PANIC = 0x3,
};

From all those severity types above, you want to do recovery for
GHES_SEV_RECOVERABLE but print *all* severities. Yes? I mean, this is
what makes most sense: you want to dump all errors but try to recover
from those from which you *actually* have the possibility to do so.

Looking at the severities conversion, GHES_SEV_RECOVERABLE is
CPER_SEV_RECOVERABLE. cper_severity_to_aer() then converts
CPER_SEV_RECOVERABLE to AER_NONFATAL.
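
To make that conversion chain concrete, here is a minimal standalone
sketch. It is not the kernel code: the numeric values of the CPER and AER
enums are local stand-ins, the sketch_* function names are hypothetical,
and every mapping other than RECOVERABLE -> NONFATAL is an assumption
made only so the example is complete.

```c
/* Standalone sketch, not kernel code. Enum values are stand-ins. */
enum {
	GHES_SEV_NO = 0x0,
	GHES_SEV_CORRECTED = 0x1,
	GHES_SEV_RECOVERABLE = 0x2,
	GHES_SEV_PANIC = 0x3,
};

/* Assumed values, chosen for illustration only. */
enum {
	CPER_SEV_RECOVERABLE,
	CPER_SEV_FATAL,
	CPER_SEV_CORRECTED,
	CPER_SEV_INFORMATIONAL,
};

/* Assumed values, chosen for illustration only. */
enum {
	AER_NONFATAL,
	AER_FATAL,
	AER_CORRECTABLE,
};

/* Hypothetical helper: firmware-reported CPER severity -> GHES severity. */
static int sketch_cper_to_ghes(int cper_sev)
{
	switch (cper_sev) {
	case CPER_SEV_INFORMATIONAL:
		return GHES_SEV_NO;
	case CPER_SEV_CORRECTED:
		return GHES_SEV_CORRECTED;
	case CPER_SEV_RECOVERABLE:
		return GHES_SEV_RECOVERABLE;
	default:
		/* Treat anything else as fatal in this sketch. */
		return GHES_SEV_PANIC;
	}
}

/* Hypothetical helper: CPER severity -> AER severity. */
static int sketch_cper_to_aer(int cper_sev)
{
	switch (cper_sev) {
	case CPER_SEV_RECOVERABLE:
		return AER_NONFATAL;	/* the mapping named in this mail */
	case CPER_SEV_FATAL:
		return AER_FATAL;	/* assumption */
	default:
		return AER_CORRECTABLE;	/* assumption */
	}
}
```

The point of the sketch is only that one firmware-reported value ends up
expressed in three different vocabularies before the AER code sees it.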

[ Btw, this is the dumbest sh*t ever. Three different severities!!!
Looks like someone has won a contest of how to design something as
needlessly complex as possible. ]

So it looks to me like you want to do rather:

	if (entry.severity == AER_NONFATAL)
		do_recovery(pdev, entry.severity);

which should correspond to GHES_SEV_RECOVERABLE. And this would be
the straightforward thing and that would be fine but...

... that is still not 100% equivalent because the check is:

if (sev == GHES_SEV_RECOVERABLE && sec_sev == GHES_SEV_RECOVERABLE...

so there's the severity of the estatus block and then the severity of
each section successively.

And I have no idea why we're doing this.

Because if we have to keep this, then the above simplification won't work and
you'll have to pass in a separate argument to aer_recover_queue():

	aer_recover_queue( ..., sev == GHES_SEV_RECOVERABLE &&
				sec_sev == GHES_SEV_RECOVERABLE, ...

which, if true, would mean, do recovery.
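
A small standalone sketch of why the two checks can disagree, and what
the extra argument would carry. Nothing here is kernel code: the
sketch_* helpers are hypothetical, the AER enum values are stand-ins,
and it assumes the AER severity is derived from the section severity
alone.

```c
#include <stdbool.h>

/* Standalone sketch, not kernel code. */
enum {
	GHES_SEV_NO = 0x0,
	GHES_SEV_CORRECTED = 0x1,
	GHES_SEV_RECOVERABLE = 0x2,
	GHES_SEV_PANIC = 0x3,
};

/* Assumed values, chosen for illustration only. */
enum {
	AER_NONFATAL,
	AER_FATAL,
	AER_CORRECTABLE,
};

/* Assumption: the AER severity only reflects the section severity. */
static int sketch_sec_sev_to_aer(int sec_sev)
{
	switch (sec_sev) {
	case GHES_SEV_RECOVERABLE:
		return AER_NONFATAL;
	case GHES_SEV_PANIC:
		return AER_FATAL;
	default:
		return AER_CORRECTABLE;
	}
}

/* The simplified check: looks at the AER severity only. */
static bool sketch_simple_check(int aer_sev)
{
	return aer_sev == AER_NONFATAL;
}

/*
 * The existing check: both the estatus block severity and the
 * section severity must be recoverable. The result is what the
 * extra (hypothetical) argument to aer_recover_queue() would carry.
 */
static bool sketch_want_recovery(int sev, int sec_sev)
{
	return sev == GHES_SEV_RECOVERABLE &&
	       sec_sev == GHES_SEV_RECOVERABLE;
}
```

With a block severity of GHES_SEV_PANIC and a section severity of
GHES_SEV_RECOVERABLE, sketch_simple_check() says "recover" while
sketch_want_recovery() says "don't". That is the case the simplification
would silently change.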

So let's find out first why we have to look at both severities.

Tony, any ideas?

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)