RE: [PATCH v2 1/2] x86/mce/AMD: Redo use of SMCA MCA_DE{STAT,ADDR} registers

From: Ghannam, Yazen
Date: Wed Apr 05 2017 - 15:30:02 EST


> -----Original Message-----
> From: Borislav Petkov [mailto:bp@xxxxxxxxx]
> Sent: Wednesday, April 05, 2017 2:22 PM
> To: Ghannam, Yazen <Yazen.Ghannam@xxxxxxx>
> Cc: linux-edac@xxxxxxxxxxxxxxx; Tony Luck <tony.luck@xxxxxxxxx>;
> x86@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 1/2] x86/mce/AMD: Redo use of SMCA
> MCA_DE{STAT,ADDR} registers
>
>
> > How does log_error() know if we can't use the normal MSRs?
>
> MCI_STATUS_VAL.
>
> > We check for MCI_STATUS_VAL in log_error().
>
> Yes.
>
> > We also need to check for MCI_STATUS_DEFERRED but only if we're coming
> > from the deferred error handler.
>
> Why? We *are* coming from the #DF handler so are you expecting a
> different type of error in the MSRs?
>

Yes, there could be, depending on how MCA_CONFIG[LogDeferredInMcaStat] is
set, among other things.

If it's set, then I expect a Deferred error in MCA_STATUS since any Correctable
errors will be overwritten. Multiple bank types can generate Deferred errors,
so there may also be cases where, for some types, a valid Uncorrectable error
happens and overwrites the Deferred error before we can handle it. In that
case, we lose the Deferred error if we don't check MCA_DESTAT.

If it's not set, then it's possible to have a valid Correctable error in MCA_STATUS
while the valid Deferred error is in MCA_DESTAT.

Right now MCA_CONFIG[LogDeferredInMcaStat] is set, but this may change for
future SMCA implementations.
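
To make the two cases concrete, here is a rough sketch of the check I have in
mind. It is not the patch itself: find_deferred() and the enum are made-up
names for illustration, the register values are passed in as plain integers
instead of being read from the MSRs, and I'm assuming that a set Val bit in
MCA_DESTAT is enough to identify a Deferred error there. The bit positions
follow MCI_STATUS_VAL and MCI_STATUS_DEFERRED in the kernel headers.

```c
#include <stdint.h>

/* Same bit positions as the kernel's MCI_STATUS_* definitions. */
#define MCI_STATUS_VAL      (1ULL << 63)  /* register holds a valid error */
#define MCI_STATUS_DEFERRED (1ULL << 44)  /* logged error is a Deferred error */

/* Hypothetical result type: where the Deferred error was found, if anywhere. */
enum deferred_src {
	DEFERRED_NONE,
	DEFERRED_IN_STATUS,
	DEFERRED_IN_DESTAT,
};

/*
 * Hypothetical helper: given the raw MCA_STATUS and MCA_DESTAT values of one
 * bank, decide which register holds the Deferred error.
 */
enum deferred_src find_deferred(uint64_t status, uint64_t destat)
{
	/* LogDeferredInMcaStat set and nothing overwrote the Deferred error. */
	if ((status & MCI_STATUS_VAL) && (status & MCI_STATUS_DEFERRED))
		return DEFERRED_IN_STATUS;

	/*
	 * MCA_STATUS is empty or holds another error type (a Correctable
	 * error, or an Uncorrectable error that overwrote the Deferred one).
	 * Assumption: MCA_DESTAT only logs Deferred errors, so Val suffices.
	 */
	if (destat & MCI_STATUS_VAL)
		return DEFERRED_IN_DESTAT;

	return DEFERRED_NONE;
}
```

With this ordering, a Deferred error that was overwritten in MCA_STATUS by an
Uncorrectable error is still picked up through MCA_DESTAT.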

Thanks,
Yazen