Re: [PATCH 5/5] x86/mce: Change default mce logger to check mce->handled

From: Borislav Petkov
Date: Thu Feb 13 2020 - 12:08:27 EST


On Wed, Feb 12, 2020 at 12:46:52PM -0800, Tony Luck wrote:
> Instead of keeping count of how many handlers are registered on the
> mce chain and printing if we are below some magic value. Look at the
> mce->handled to see if anyone claims to have handled/logged this error.
>
> [debug to always print in this version]
>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
> arch/x86/kernel/cpu/mce/core.c | 20 ++++----------------
> 1 file changed, 4 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index ce7a78872f8f..5b73df383300 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -156,29 +156,17 @@ void mce_log(struct mce *m)
> }
> EXPORT_SYMBOL_GPL(mce_log);
>
> -/*
> - * We run the default notifier if we have only the UC, the first and the
> - * default notifier registered. I.e., the mandatory NUM_DEFAULT_NOTIFIERS
> - * notifiers registered on the chain.
> - */
> -#define NUM_DEFAULT_NOTIFIERS 3
> -static atomic_t num_notifiers;
> -

I definitely like where this is going.
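
To make the idea concrete, here is a minimal userspace sketch, not the
actual patch: the default logger prints only when nobody earlier on the
chain has claimed the error. struct mce_sketch, the SKETCH_HANDLED_* bits
and default_logger_should_print() are hypothetical names made up for
illustration; only the notion of a "handled" field comes from the patch
description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mce; only what the sketch needs. */
struct mce_sketch {
	unsigned long long status;
	unsigned int handled;	/* bitmask of notifiers that claimed the error */
};

/* Hypothetical flag names, one bit per claiming notifier. */
#define SKETCH_HANDLED_EDAC	(1u << 0)
#define SKETCH_HANDLED_CEC	(1u << 1)

/*
 * Default logger policy: print only if no earlier notifier has
 * handled/logged this error. No counting of registered notifiers.
 */
static bool default_logger_should_print(const struct mce_sketch *m)
{
	assert(m);
	return m->handled == 0;
}
```

With that, the decision depends on what actually happened to this
particular error and not on how many handlers happen to be registered.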

Another thing: what do we do if we have to deviate from that sequential
path through the notifiers? What if notifier A gets to look at an error,
then another notifier B needs to look at it, and then the information
obtained from notifier B is needed by notifier A again to inspect the
error a *second* time?

I don't think there's a case like that now but I'm just playing devil's
advocate here, because a use case like that would break our simplistic,
sequential assembly line of MCE decoding.
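
A toy userspace model of that scenario, purely as an illustration and
not kernel code: each notifier is called exactly once, in order, so A
has already run by the time B produces its information. All names here
(struct err_sketch, notifier_a, notifier_b, walk_chain_once) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-error state shared along the chain. */
struct err_sketch {
	bool b_info_valid;	/* set once notifier B has inspected the error */
	int  a_visits;		/* how many times notifier A saw the error */
	bool a_used_b_info;	/* did A ever get to use what B found out? */
};

typedef void (*notifier_fn)(struct err_sketch *e);

/* Notifier A: would like to use B's information, if it is there. */
static void notifier_a(struct err_sketch *e)
{
	e->a_visits++;
	if (e->b_info_valid)
		e->a_used_b_info = true;
}

/* Notifier B: produces the information A is interested in. */
static void notifier_b(struct err_sketch *e)
{
	e->b_info_valid = true;
}

/* One strictly sequential pass: every notifier runs exactly once. */
static void walk_chain_once(notifier_fn *chain, size_t n, struct err_sketch *e)
{
	assert(chain && e);
	for (size_t i = 0; i < n; i++)
		chain[i](e);
}
```

Walking the chain { notifier_a, notifier_b } once leaves a_visits at 1
and a_used_b_info false: B's result exists by the end, but A never gets
the second look it would need.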

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette