Re: [PATCH v2] x86/virt: Silence RCU lockdep splat in emergency virt callback path

From: Sean Christopherson

Date: Thu May 07 2026 - 18:00:00 EST


On Tue, May 05, 2026, Mikhail Gavrilov wrote:
> x86_virt_invoke_kvm_emergency_callback() reaches rcu_dereference()
> through machine_crash_shutdown() with IRQs disabled but with RCU not
> necessarily watching the crashing CPU, which triggers a suspicious
> RCU usage splat on debug kernels (CONFIG_PROVE_RCU=y) during
> panic/kdump:
>
> WARNING: suspicious RCU usage
> arch/x86/virt/hw.c:52 suspicious rcu_dereference_check() usage!
>
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by tee/11119:
> #0: ffff8881fa32c440 (sb_writers#3){.+.+}-{0:0}, at: ksys_write
>
> Call Trace:
> <TASK>
> dump_stack_lvl+0x84/0xd0
> lockdep_rcu_suspicious.cold+0x37/0x8f
> x86_virt_invoke_kvm_emergency_callback+0x5f/0x70
> x86_svm_emergency_disable_virtualization_cpu+0x2a/0x30
> x86_virt_emergency_disable_virtualization_cpu+0x6b/0x90
> native_machine_crash_shutdown+0x72/0x170
> __crash_kexec+0x137/0x280
> panic+0xce/0xd0
> sysrq_handle_crash+0x1f/0x20
> __handle_sysrq.cold+0x192/0x335
> write_sysrq_trigger+0x8c/0xc0
> proc_reg_write+0x1c3/0x3c0
> vfs_write+0x1d0/0xf80
> ksys_write+0x116/0x250
> do_syscall_64+0x11c/0x1480
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> </TASK>
>
> A truly correct fix is non-trivial: the RCU usage genuinely is wrong in
> panic context (RCU may ignore the crashing CPU during synchronization),
> and a concurrent KVM module unload could in principle race with the
> callback read; see commit 2baa33a8ddd6 ("KVM: x86: Leave user-return
> notifier registered on reboot/shutdown") which notes that nothing
> prevents module unload during panic/reboot.
>
> However, the alternatives are worse:
>
> - smp_store_release()/smp_load_acquire() handle ordering but not
>   liveness; the kernel still needs to keep the module text alive
>   while the callback is in flight.
> - Taking a lock in the panic path is risky; any lock could be held
>   by a CPU that has already been NMI'd to a halt.
>
> Use rcu_dereference_raw() to silence the splat and accept the
> vanishingly small remaining race. Panic context inherently cannot
> guarantee complete correctness; the goal here is to keep debug builds
> quiet on the kdump path so the splat doesn't obscure the actual
> kernel state being captured.
>
> Reproducible on a debug kernel (CONFIG_PROVE_LOCKING=y, CONFIG_PROVE_RCU=y)
> with kvm_amd or kvm_intel loaded by triggering kdump:
>
> echo c > /proc/sysrq-trigger
>
> Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> Fixes: 428afac5a8ea ("KVM: x86: Move bulk of emergency virtualizaton logic to virt subsystem")
> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx>
> ---

Acked-by: Sean Christopherson <seanjc@xxxxxxxxxx>

(I can also take this through kvm-x86; I have no preference whatsoever)
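
For readers following the ordering-vs-liveness point in the quoted changelog, here is a minimal userspace sketch using C11 atomics. It is not the kernel code and not the patch itself: emergency_cb_t, register_cb(), unregister_cb() and invoke_cb() are hypothetical names made up for illustration. It only models why a release/acquire pair orders publication of the callback pointer without keeping the callback's code alive once it has been loaded.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the emergency callback type (assumption). */
typedef void (*emergency_cb_t)(void);

/* Published callback pointer; NULL means "nothing registered". */
static _Atomic(emergency_cb_t) emergency_cb;

/* Counts invocations so the behaviour can be observed. */
static int calls;

static void demo_callback(void)
{
	calls++;
}

/* Publish the callback; release orders prior setup before the store. */
static void register_cb(emergency_cb_t cb)
{
	atomic_store_explicit(&emergency_cb, cb, memory_order_release);
}

/* Unpublish the callback. Nothing here waits for in-flight callers. */
static void unregister_cb(void)
{
	atomic_store_explicit(&emergency_cb, (emergency_cb_t)NULL,
			      memory_order_release);
}

/*
 * Load the pointer once and call through the local copy. The acquire
 * load guarantees ordering only. If the owner of the callback were
 * torn down between the load and the call, this would still jump to
 * stale code; that is the liveness gap the changelog accepts.
 */
static int invoke_cb(void)
{
	emergency_cb_t cb = atomic_load_explicit(&emergency_cb,
						 memory_order_acquire);

	if (!cb)
		return 0;

	cb();
	return 1;
}
```

Calling invoke_cb() before register_cb(demo_callback) returns 0 and does nothing; after registration it returns 1 and bumps calls; after unregister_cb() it returns 0 again. The window between the load and the call in invoke_cb() is the "vanishingly small remaining race" referred to above.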