Re: [PATCH 4/9] x86/mce: Move machine_check_poll() status checks to helper functions

From: Borislav Petkov
Date: Mon Jun 03 2024 - 13:37:53 EST


On Thu, May 23, 2024 at 10:56:36AM -0500, Yazen Ghannam wrote:
> @@ -709,48 +747,9 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
> if (!mca_cfg.cmci_disabled)
> mce_track_storm(&m);
>
> - /* If this entry is not valid, ignore it */
> - if (!(m.status & MCI_STATUS_VAL))
> + if (!log_poll_error(flags, &m))
> continue;
>
> - /*
> - * If we are logging everything (at CPU online) or this
> - * is a corrected error, then we must log it.
> - */
> - if ((flags & MCP_UC) || !(m.status & MCI_STATUS_UC))
> - goto log_it;
> -
> - /*
> - * Newer Intel systems that support software error
> - * recovery need to make additional checks. Other
> - * CPUs should skip over uncorrected errors, but log
> - * everything else.
> - */

You lost that comment.

> - if (!mca_cfg.ser) {
> - if (m.status & MCI_STATUS_UC)
> - continue;
> - goto log_it;
> - }
> -
> - /* Log "not enabled" (speculative) errors */
> - if (!(m.status & MCI_STATUS_EN))
> - goto log_it;
> -
> - /*
> - * Log UCNA (SDM: 15.6.3 "UCR Error Classification")
> - * UC == 1 && PCC == 0 && S == 0
> - */
> - if (!(m.status & MCI_STATUS_PCC) && !(m.status & MCI_STATUS_S))
> - goto log_it;
> -
> - /*
> - * Skip anything else. Presumption is that our read of this
> - * bank is racing with a machine check. Leave the log alone
> - * for do_machine_check() to deal with it.
> - */
> - continue;
> -
> -log_it:
> if (flags & MCP_DONTLOG)
> goto clear_it;

Btw, the code looks really weird now:

	if (!log_poll_error(flags, &m))
		continue;

	if (flags & MCP_DONTLOG)
		goto clear_it;

i.e.,

1. Should I log it?

2. Should I not log it?

Oh well, it was like that before logically so...
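
For reference, here is a rough user-space sketch of the filter the removed
lines implemented, with the lost comment kept in place. It is not the helper
from the patch: should_log_poll_error(), its parameters and
poll_filter_self_check() are made-up names for illustration, the ser argument
stands in for mca_cfg.ser, and the bit positions are local stand-ins that are
assumed to match the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; assumed to mirror the kernel's MCI_STATUS_* and MCP_UC. */
#define MCI_STATUS_VAL	(1ULL << 63)	/* valid error */
#define MCI_STATUS_UC	(1ULL << 61)	/* uncorrected error */
#define MCI_STATUS_EN	(1ULL << 60)	/* error enabled */
#define MCI_STATUS_PCC	(1ULL << 57)	/* processor context corrupt */
#define MCI_STATUS_S	(1ULL << 56)	/* signaled machine check */
#define MCP_UC		(1U << 1)	/* log uncorrected errors too */

/* Hypothetical model of the poll filter: true means "log it". */
bool should_log_poll_error(unsigned int flags, uint64_t status, bool ser)
{
	/* If this entry is not valid, ignore it */
	if (!(status & MCI_STATUS_VAL))
		return false;

	/*
	 * If we are logging everything (at CPU online) or this
	 * is a corrected error, then we must log it.
	 */
	if ((flags & MCP_UC) || !(status & MCI_STATUS_UC))
		return true;

	/*
	 * Newer Intel systems that support software error
	 * recovery need to make additional checks. Other
	 * CPUs should skip over uncorrected errors, but log
	 * everything else.
	 */
	if (!ser)
		return !(status & MCI_STATUS_UC);

	/* Log "not enabled" (speculative) errors */
	if (!(status & MCI_STATUS_EN))
		return true;

	/* Log UCNA: UC == 1 && PCC == 0 && S == 0 */
	if (!(status & MCI_STATUS_PCC) && !(status & MCI_STATUS_S))
		return true;

	/* Skip anything else; presumably racing with a machine check. */
	return false;
}

/* Tiny usage example: an invalid bank is skipped, a corrected error is logged. */
void poll_filter_self_check(void)
{
	assert(!should_log_poll_error(0, 0, true));
	assert(should_log_poll_error(0, MCI_STATUS_VAL, false));
}
```

Note that this only models question 1 above. The MCP_DONTLOG check stays in
the caller, which is why the two questions still sit back to back.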

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette