On Tue, Jan 10, 2023 at 08:32:07PM +0800, Zeng Heng wrote:
> mce is registered on NMI handler by inject_init().

That's a handler for the NMI raised by raise_mce(). That's for the injection
case, which is simulated. If you're fixing the injection case, then surely not
with a bogus boot NMI handler.

> Yes, exactly. The following procedure is like:
>
> panic() -> relocate_kernel() -> identity_mapped() -> x86 purgatory image ->
> EFI loader -> secondary kernel

I'm doubtful now as you're injecting errors so you're not really in #MC context
but in this contrived context which is actually an NMI one. So we need to think
about how to fix this case.
Certainly not with an empty NMI handler...
Regardless, we should do
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 7832a69d170e..57fe376ed049 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -286,6 +286,8 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
 	if (!fake_panic) {
 		if (panic_timeout == 0)
 			panic_timeout = mca_cfg.panic_timeout;
+
+		mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
 		panic(msg);
 	} else
 		pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);
so that we don't run kexec in #MC context.
Hmmm.
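
For illustration only, here is a rough user-space sketch of why clearing
MCG_STATUS before panic() matters. It models just the MCIP bit (bit 2 of
IA32_MCG_STATUS): the CPU sets it when delivering a #MC, and a further #MC
while it is still set shuts the CPU down. struct fake_cpu, deliver_mc() and
clear_mcg_status() are hypothetical names made up for this sketch, not kernel
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 2 of IA32_MCG_STATUS: machine check in progress. */
#define MCG_STATUS_MCIP (1ULL << 2)

/* Hypothetical model of one CPU; not a kernel structure. */
struct fake_cpu {
	uint64_t mcg_status;
	bool shutdown;
};

/* Model #MC delivery: a second #MC while MCIP is set shuts the CPU down. */
static void deliver_mc(struct fake_cpu *cpu)
{
	assert(cpu != NULL);

	if (cpu->mcg_status & MCG_STATUS_MCIP) {
		cpu->shutdown = true;
		return;
	}
	cpu->mcg_status |= MCG_STATUS_MCIP;
}

/* Stand-in for mce_wrmsrl(MSR_IA32_MCG_STATUS, 0): leave #MC context. */
static void clear_mcg_status(struct fake_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->mcg_status = 0;
}
```

In this model, deliver_mc() twice in a row ends in shutdown, while
deliver_mc(), clear_mcg_status(), deliver_mc() does not. That is the window
the kexec'ed kernel would otherwise be running in.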