Re: [PATCH 2/2] x86/mce: Dump the stack for recoverable machine checks in kernel context

From: Borislav Petkov
Date: Mon Oct 31 2022 - 12:44:57 EST


On Thu, Sep 22, 2022 at 12:51:36PM -0700, Tony Luck wrote:
> @@ -254,6 +255,9 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
> wait_for_panic();
> barrier();
>
> + if (final->severity == MCE_PANIC_STACKDUMP_SEVERITY)
> + show_stack(NULL, NULL, KERN_DEFAULT);

So this is kinda weird, IMO:

1. If the error has raised an MCE, then we will dump the stack anyway.

2. If the error is the result of consuming poison or some other deferred
type which doesn't raise an exception immediately, then we have missed
it because we don't have the stack at the time the error got detected by
the hardware.

3. If all you wanna do is avoid useless stack traces, you can simply
ignore them. :)

IOW, it will dump stack in the cases we're interested in and it will
dump stack in a couple of other PANIC cases. So? We simply ignore the
latter.

But I don't see the point of adding code just so that we can suppress
the uninteresting ones...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette