RE: [PATCH] x86, mce: Fix machine_check_poll() tests for which errors to log

From: Ghannam, Yazen
Date: Mon Mar 11 2019 - 18:10:34 EST


> -----Original Message-----
> From: Luck, Tony <tony.luck@xxxxxxxxx>
> Sent: Monday, March 11, 2019 3:42 PM
> To: Ghannam, Yazen <Yazen.Ghannam@xxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>; x86@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Ashok Raj <ashok.raj@xxxxxxxxx>
> Subject: Re: [PATCH] x86, mce: Fix machine_check_poll() tests for which errors to log
>
> On Mon, Mar 11, 2019 at 08:25:53PM +0000, Ghannam, Yazen wrote:
> > > +	if (!(m.status & MCI_STATUS_PCC) && !(m.status & MCI_STATUS_S))
> > > +		goto log_it;
> > > +
> >
> > Can you please include a vendor check with this? MCi_STATUS[56] is
> > not defined the same way on AMD systems.
>
> Original code also looked at MCi_STATUS[56] without a vendor
> check:
>
> > > - (m.status & (mca_cfg.ser ? MCI_STATUS_S : MCI_STATUS_UC)))
>
> Was this OK because you don't set mca_cfg.ser?
>
> If so, my new code will also skip out before getting to this test. But
> should probably have a better comment. Something like:
>
>
> 	/*
> 	 * Newer Intel systems that support software error
> 	 * recovery need to make some extra checks. Other
> 	 * CPUs should skip over uncorrected errors, but log
> 	 * everything else
> 	 */
> 	if (!mca_cfg.ser) {
> 		if (m.status & MCI_STATUS_UC)
> 			continue;
> 		goto log_it;
> 	}
>

Yes, you're right. Thanks for pointing that out.
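
To spell out the ordering discussed above, here is a minimal stand-alone sketch of the two checks, not the actual machine_check_poll() code. The names classify_bank() and enum poll_action are hypothetical and exist only for illustration. The bit positions for UC and PCC are my assumption based on the Intel MCi_STATUS layout; only bit 56 is named in this thread. Anything that gets past both checks is reported as POLL_MORE_CHECKS, because the rest of the function is not quoted here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed Intel MCi_STATUS bit positions. Only bit 56 (S) is
 * mentioned in the thread; UC and PCC are assumptions.
 */
#define MCI_STATUS_UC	(1ULL << 61)	/* uncorrected error */
#define MCI_STATUS_PCC	(1ULL << 57)	/* processor context corrupt */
#define MCI_STATUS_S	(1ULL << 56)	/* signaled (Intel meaning only) */

/* Hypothetical result type, for illustration only. */
enum poll_action {
	POLL_LOG,		/* corresponds to "goto log_it" */
	POLL_SKIP,		/* corresponds to "continue" */
	POLL_MORE_CHECKS,	/* handled by code not quoted in the thread */
};

/*
 * "ser" stands in for mca_cfg.ser. When it is false, bits 56 and 57
 * are never examined, so a different vendor definition of
 * MCi_STATUS[56] cannot affect the result.
 */
static enum poll_action classify_bank(uint64_t status, bool ser)
{
	if (!ser) {
		if (status & MCI_STATUS_UC)
			return POLL_SKIP;
		return POLL_LOG;
	}

	/* Reached only on systems with software error recovery. */
	if (!(status & MCI_STATUS_PCC) && !(status & MCI_STATUS_S))
		return POLL_LOG;

	return POLL_MORE_CHECKS;
}
```

With ser false, classify_bank(MCI_STATUS_UC | MCI_STATUS_S, false) returns POLL_SKIP purely because of the UC bit, which is why the early exit makes a separate vendor check unnecessary for the PCC/S test.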

-Yazen