Re: [PATCH v2 1/2] mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC

From: Naveen N. Rao
Date: Thu Jun 20 2013 - 17:22:43 EST

On 06/21/2013 02:27 AM, Borislav Petkov wrote:
> On Fri, Jun 21, 2013 at 01:44:00AM +0530, Naveen N. Rao wrote:
>> This won't work across cpu offline/online, right? We will end up
>> _not_ enabling CMCI on certain banks where we should have.
>
> Huh, don't understand. cmci_discover runs on each CPU. After you've run
> hest_parse_cmc early during boot and cleared the mce_poll_banks bits,
> nothing will set them again so CPU hotplug doesn't matter...

Exactly, but mce_poll_banks also doesn't have bits set for banks on which CMCI is enabled.

Let's say we have a cpu with two banks (not shared), neither of which works in FF mode. Both these banks support CMCI, so mce_poll_banks won't have these bits set.

On cpu offline, we call cmci_clear(), which disables CMCI on these two banks before offlining the cpu. When this cpu is brought online again, we call cmci_discover(), which sees that mce_poll_banks doesn't have bits set for these two banks and skips enabling CMCI on them, thinking they are in FF mode.
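To make the sequence concrete, here is a toy userspace model of it. This is only a sketch, not kernel code: struct cpu_model, bank_bit() and the model_* functions are made-up names, and I am assuming every poll bit starts out set at boot and that a clear bit is what the proposed cmci_discover() check reads as "firmware first".

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_BANKS 2

/* Hypothetical per-CPU state; not the kernel's real data layout. */
struct cpu_model {
	unsigned long poll_banks;	/* stand-in for mce_poll_banks */
	unsigned long cmci_owned;	/* banks with CMCI currently enabled */
};

static bool bank_bit(unsigned long map, unsigned int bank)
{
	assert(bank < NUM_BANKS);
	return (map >> bank) & 1UL;
}

/*
 * Model of the proposed cmci_discover(): a bank whose poll bit is
 * clear is assumed to be in firmware first mode and is skipped.
 */
void model_cmci_discover(struct cpu_model *cpu)
{
	for (unsigned int bank = 0; bank < NUM_BANKS; bank++) {
		if (!bank_bit(cpu->poll_banks, bank))
			continue;
		cpu->cmci_owned |= 1UL << bank;		/* CMCI enabled */
		cpu->poll_banks &= ~(1UL << bank);	/* no polling needed */
	}
}

/* Model of cmci_clear(): CMCI is disabled, poll bits are not restored. */
void model_cmci_clear(struct cpu_model *cpu)
{
	cpu->cmci_owned = 0;
}

/* Number of banks with CMCI enabled after one offline/online cycle. */
unsigned int cmci_banks_after_hotplug(void)
{
	struct cpu_model cpu = {
		.poll_banks = (1UL << NUM_BANKS) - 1,	/* assumed boot state */
		.cmci_owned = 0,
	};
	unsigned int count = 0;

	model_cmci_discover(&cpu);	/* boot: both banks get CMCI */
	model_cmci_clear(&cpu);		/* cpu offline */
	model_cmci_discover(&cpu);	/* cpu online: both banks skipped */

	for (unsigned int bank = 0; bank < NUM_BANKS; bank++)
		count += bank_bit(cpu.cmci_owned, bank);
	return count;
}
```

In this model cmci_banks_after_hotplug() comes back with no CMCI-enabled banks, even though neither bank was ever in FF mode, which is the problem I am describing.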


>> Another thing: for hest_parse_cmc(), does the below look good?
>>
>> 	cmc = (struct acpi_hest_ia_corrected *)hest_hdr;
>> 	if (!cmc->enabled)
>> 		return 0;
>>
>> 	/*
>> 	 * We expect HEST to provide a list of MC banks that
>> 	 * report errors in firmware first mode.
>> 	 */
>> 	if (!(cmc->flags & ACPI_HEST_FIRMWARE_FIRST) ||
>>
>> The return value doesn't really matter since we don't check it, but
>> returning an error looked like the wrong thing to do as well.
>
> I'd add a comment above the "return 1" statement to explain why I'm
> doing this. It is much more verbose even than a well-named macro :)

Okay :)


