Re: [PATCH] ACPI/APEI: Clear GHES block_status before panic()

From: Borislav Petkov
Date: Fri Dec 21 2018 - 13:59:46 EST


On Fri, Dec 21, 2018 at 06:52:20PM +0000, James Morse wrote:
> Do we need to ghes_ack_error() too?

That's GHES v2 AFAICT.

> With the location cleared the new kernel will never find the records, and
> firmware can never re-use that location because it wasn't ack'd. The upshot is
> RAS records can't be generated for the kdump kernel. The ACPI spec talks about
> use of the memory, so I don't think it's fair for it to use this to disarm a
> watchdog.
>
> I think we can live with this as the kdump kernel isn't going to handle RAS
> errors for the bulk of memory anyway.

Handling hw errors is usually better than not, but the second
kernel can't do anything better in that respect than the first, right?
If it panics, it panics - no matter the kernel. Generally.

Therefore I think the role of the second kernel should be to be as
resilient as possible to hw errors - like, not even see them :-) - dump
the memory of the first kernel as quickly as possible and reboot for
analysis.

IMHO, of course.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.