Re: [PATCH] x86/mce: Cover grading of AMD machine error checks

From: Carlos Bilbao
Date: Fri Mar 11 2022 - 09:07:35 EST


On 3/10/2022 4:13 PM, Borislav Petkov wrote:
> On Thu, Mar 10, 2022 at 12:24:08PM -0600, Carlos Bilbao wrote:
>> We will cover grading of MCEs like deferred memory scrub errors, attempts
>> to access poisonous data, etc. I could list all new covered cases in the
>> commit message if you think that'd be positive.
>
> So no actual use case - you want to grade error severity for all types
> of MCEs.
>
>> Hope that helps clarify,
>
> Yes, it does a bit.
>
> It sounds to me like you want to do at least two patches:
>
> 1. Extend the severity grading function with the new types of errors
>
> 2. Add string descriptions of the error types mce_severity_amd() looks
> at, so that mce_panic() issues them.
>
> I.e., you want to decode the fatal MCEs which panic the machine.
>
> In general, what would help is if you think about what you're trying to
> achieve and write it down first. How to achieve that we can figure out
> later.
>
> What happens now is you send me a patch and I'm trying to decipher from
> the code why you're doing what you're doing. Which is kinda backwards if
> you think about it...
>

Glad we are on the same page now. I will prepare a patchset and include more
informative commit messages.

Thanks,
Carlos