Re: [PATCH 4/5] x86/mce: Fix all mce notifiers to update the mce->handled bitmask

From: Borislav Petkov
Date: Thu Feb 13 2020 - 12:03:17 EST


On Wed, Feb 12, 2020 at 12:46:51PM -0800, Tony Luck wrote:
> If the handler took any action to log or deal with the error, set
> a bit in mce->handled so that the default handler on the end of
> the machine check chain can see what has been done.
>
> [!!! What to do about NOTIFY_STOP ... any handler that returns this
> value short-circuits calling subsequent entries on the chain. In
> some cases this may be the right thing to do ... but in others we
> really want to keep calling other functions on the chain]

Yes, we can kill that NOTIFY_STOP thing in the mce code since it is
nasty.

Then, from the looks of it, there should be a function at the end of
the chain which decides whether to print or not, just by looking at
->handled.

For example, it should not print errors which are marked
MCE_HANDLED_CEC, MCE_HANDLED_EDAC, etc. The assumption for the latter
is that EDAC does its own printing. Which then makes me wonder whether
MCE_HANDLED_EDAC is enough.

Because this one bit would basically determine whether the error gets
printed or not. Which would mean that all EDAC drivers should print
it...
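
To make the end-of-chain idea a bit more concrete, here is a rough
userspace sketch of such a decision function. It is not taken from the
patch: the bit values, the MCE_QUIET_MASK name, struct mce_sketch and
mce_default_should_print() are all made up for illustration. Only the
MCE_HANDLED_CEC and MCE_HANDLED_EDAC names come from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumptions for illustration, not the patch's values. */
#define MCE_HANDLED_CEC		(1u << 0)
#define MCE_HANDLED_EDAC	(1u << 1)

/* Hypothetical mask: handlers assumed to do their own logging/printing. */
#define MCE_QUIET_MASK		(MCE_HANDLED_CEC | MCE_HANDLED_EDAC)

/* Cut-down stand-in for struct mce, carrying only the field discussed. */
struct mce_sketch {
	uint32_t handled;
};

/*
 * Hypothetical last-in-chain decision: print the error unless a handler
 * which is assumed to report on its own has already claimed it.
 */
static bool mce_default_should_print(const struct mce_sketch *m)
{
	return !(m->handled & MCE_QUIET_MASK);
}
```

With something like this, an error with ->handled == 0 gets printed and
one with MCE_HANDLED_EDAC set does not. So an EDAC driver which sets the
bit but prints nothing would make the error disappear silently, which is
the concern above.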

All I'm saying is, we should think about modalities like that.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette