RE: [PATCH] x86/mce: Always save severity in machine_check_poll

From: Ghannam, Yazen
Date: Fri Jun 16 2017 - 10:50:04 EST


> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx [mailto:linux-kernel-
> owner@xxxxxxxxxxxxxxx] On Behalf Of Borislav Petkov
> Sent: Wednesday, June 14, 2017 10:21 AM
> To: Ghannam, Yazen <Yazen.Ghannam@xxxxxxx>
> Cc: linux-edac@xxxxxxxxxxxxxxx; Tony Luck <tony.luck@xxxxxxxxx>;
> x86@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] x86/mce: Always save severity in machine_check_poll
>
> On Mon, Jun 12, 2017 at 11:54:06AM -0500, Yazen Ghannam wrote:
> > From: Yazen Ghannam <yazen.ghannam@xxxxxxx>
> >
> > Remove code that was used to decide whether to schedule work. The
> > decision
>
> ???
>
> I'm missing a *lot* of background in order to understand what that sentence
> means.
>

The code block being removed here was added in the following commit to decide
whether or not to schedule work.

fa92c58 x86, mce: Support memory error recovery for both UCNA and Deferred error in machine_check_poll

The following commit then based the decision to schedule work on whether we
have a usable address, and made that decision later in machine_check_poll().

8b38937b x86/mce: Do not enter deferred errors into the generic pool twice

Then the following commit removed m.usable_addr from the code block.

c0ec382 x86/RAS: Remove mce.usable_addr

So now this code block only decides whether or not to save the severity. We can
remove it, since it no longer serves its original purpose (deciding whether to
schedule work).

Tony is concerned that some notifiers may assume that a set severity means
the error is a memory error. As far as I can tell, the only notifier that uses
the severity is the SRAO notifier, and it doesn't make that assumption.

We schedule work if we want to log the error or if we have a usable address.
So there's no reason not to save the severity anymore.
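
To illustrate the point above, here is a rough standalone sketch of the
resulting logic. This is not the actual machine_check_poll() code: the struct
and the helper names (mce_sketch, save_severity, should_schedule_work) are
made up for illustration, and addr_usable just stands in for whatever check
determines that the address is usable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mce; only the fields needed here. */
struct mce_sketch {
	int severity;
	bool addr_usable;	/* stand-in for a usable-address check */
};

/* With the block removed, the severity is saved unconditionally. */
static void save_severity(struct mce_sketch *m, int severity)
{
	assert(m != NULL);
	m->severity = severity;
}

/*
 * Work is scheduled if the error is going to be logged, or if it has
 * a usable address. The severity plays no part in this decision.
 */
static bool should_schedule_work(const struct mce_sketch *m, bool want_log)
{
	return want_log || m->addr_usable;
}
```

Since should_schedule_work() never looks at the severity, saving it in every
case doesn't change which errors get queued.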

Thanks,
Yazen