[PATCH 0/2] x86/crash: Fix double list_add nmi_shootdown bug

From: Sean Christopherson
Date: Wed May 11 2022 - 19:43:51 EST


Fix a double list_add() bug found and debugged by Guilherme, who did all
the hard work. nmi_shootdown_cpus() doesn't play nice with being called
more than once. With the "right" kexec/kdump configuration,
emergency_vmx_disable_all() can be reached after kdump_nmi_shootdown_cpus()
(the two users of nmi_shootdown_cpus()).

My solution is to turn the emergency_vmx_disable_all() shootdown into a
nop of sorts, and move the disabling of virtualization into the core
crash_nmi_callback() handler. The only thing emergency_vmx_disable_all()
cares about is disabling VMX/SVM (obviously), and since I can't envision a
use case for an NMI shootdown that doesn't want to disable virtualization,
doing that in the core handler means emergency_vmx_disable_all() only
needs to ensure _a_ shootdown occurs, it doesn't care when that shootdown
happened or what callback was run.

This obviously punts on making nmi_shootdown_cpus() truly multi-caller
friendly, but notifier chains tend to be messy, and it's not obvious to
me what would be the desired/correct behavior for a true multi-shootdown
use case.

Patch 2 is a related bug fix found while exploring ideas for patch 1.

Sean Christopherson (2):
x86/crash: Disable virt in core NMI crash handler to avoid double
list_add
x86/reboot: Disable virtualization in an emergency if SVM is supported

arch/x86/include/asm/reboot.h | 1 +
arch/x86/kernel/crash.c | 16 +--------
arch/x86/kernel/reboot.c | 64 +++++++++++++++++++++++++++--------
3 files changed, 51 insertions(+), 30 deletions(-)


base-commit: feb9c5e19e913b53cb536a7aa7c9f20107bb51ec
--
2.36.0.512.ge40c2bad7a-goog