[RFC] How to test panic handlers, without crashing the kernel
From: Jocelyn Falempe
Date: Fri Mar 01 2024 - 06:15:40 EST
Hi,
While writing a panic handler for drm devices [1], I needed a way to
test it without crashing the machine.
So from debugfs, I called
atomic_notifier_call_chain(&panic_notifier_list, ...), but that has the
side effect of calling all the other registered panic notifiers as well.
So Sima suggested moving that to the generic panic code, and testing all
panic notifiers through a dedicated debugfs interface.
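
To illustrate the side effect described above, here is a rough userspace
model of a notifier chain. It is only a sketch: the types imitate the
kernel's struct notifier_block, but model_chain_register(),
model_chain_call(), the "hits" counter and the three notifiers are
hypothetical names made up for this example, not kernel code.

```c
#include <stddef.h>

/* Userspace model only: these types imitate, but are not, the kernel's. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
	struct notifier_block *next;
	int priority;
	int hits;		/* hypothetical counter, for the demo only */
};

struct notifier_head {
	struct notifier_block *head;
};

/* Insert in descending priority order; equal priorities keep their order. */
static void model_chain_register(struct notifier_head *nh,
				 struct notifier_block *n)
{
	struct notifier_block **p = &nh->head;

	while (*p && (*p)->priority >= n->priority)
		p = &(*p)->next;
	n->next = *p;
	*p = n;
}

/* Call every registered callback; the caller cannot pick just one. */
static int model_chain_call(struct notifier_head *nh,
			    unsigned long action, void *data)
{
	struct notifier_block *nb;
	int calls = 0;

	for (nb = nh->head; nb; nb = nb->next) {
		nb->notifier_call(nb, action, data);
		calls++;
	}
	return calls;
}

/* Shared callback for the demo: just record that it was invoked. */
static int count_hit(struct notifier_block *nb,
		     unsigned long action, void *data)
{
	(void)action;
	(void)data;
	nb->hits++;
	return 0;
}

/* Returns how many notifiers other than the one under test were run. */
static int demo_unrelated_calls(void)
{
	struct notifier_head list = { NULL };
	struct notifier_block drm = { count_hit, NULL, 0, 0 };
	struct notifier_block other_a = { count_hit, NULL, 10, 0 };
	struct notifier_block other_b = { count_hit, NULL, 0, 0 };

	model_chain_register(&list, &drm);
	model_chain_register(&list, &other_a);
	model_chain_register(&list, &other_b);

	/* The intent is to test "drm" only, but the whole chain runs. */
	model_chain_call(&list, 0, NULL);
	return other_a.hits + other_b.hits;
}
```

In this model, walking the chain to exercise the one notifier of
interest also runs the two unrelated ones, which is the problem with
calling the chain directly from a per-driver debugfs file.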
I can move that code to kernel/, but before doing that, I would like to
know if you think that's the right way to test the panic code.
The second question is how to simulate a panic context in a
non-destructive way, so we can test the panic notifiers in CI, without
crashing the machine. The worst case for a panic notifier is when the
panic occurs in NMI context, but I don't know how to simulate that. The
goal would be to find out early whether a panic notifier tries to sleep
or does other things that are not allowed in a panic context.
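
To make that goal concrete, here is a minimal userspace sketch of the
kind of check I have in mind. It does not simulate NMI context at all:
model_in_panic is a plain flag standing in for "we are in a panic-like
context", and model_might_sleep() stands in for whatever annotation a
sleeping primitive would carry. All names here are hypothetical and the
code is only a model, not a proposal for the actual implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag standing in for a panic-like (atomic) context. */
static bool model_in_panic;
/* Number of forbidden operations seen while the flag was set. */
static int model_violations;

/* Stand-in for the annotation a sleeping primitive would carry. */
static void model_might_sleep(void)
{
	if (model_in_panic)
		model_violations++;
}

typedef void (*panic_cb)(void);

/* A well-behaved notifier: it never reaches a sleeping primitive. */
static void good_notifier(void)
{
}

/* A buggy notifier: it would, for example, take a sleeping lock. */
static void bad_notifier(void)
{
	model_might_sleep();
}

/*
 * Run each callback with the flag set, one at a time, and return a
 * bitmask with bit i set if callback i did something forbidden.
 */
static unsigned int check_notifiers(const panic_cb *cbs, size_t n)
{
	unsigned int offenders = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int before = model_violations;

		model_in_panic = true;
		cbs[i]();
		model_in_panic = false;

		if (model_violations != before)
			offenders |= 1u << i;
	}
	return offenders;
}

/* Demo: only the second callback (bit 1) should be reported. */
static unsigned int demo_check(void)
{
	const panic_cb cbs[] = { good_notifier, bad_notifier };

	return check_notifiers(cbs, sizeof(cbs) / sizeof(cbs[0]));
}
```

Running each callback separately is a deliberate choice in this sketch,
so that a violation can be attributed to a single notifier instead of
to the chain as a whole.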
Best regards,
--
Jocelyn
[1] https://patchwork.freedesktop.org/patch/580183/?series=122244&rev=8