[RFC] How to test panic handlers, without crashing the kernel

From: Jocelyn Falempe
Date: Fri Mar 01 2024 - 06:15:40 EST

Next message: Barry Song: "Re: [PATCH v6] crypto: scompress: remove memcpy if sg_nents is 1 and pages are lowmem"
Previous message: Andrew Jones: "Re: [PATCH v4 04/15] RISC-V: Add SBI PMU snapshot definitions"
Next in thread: John Ogness: "Re: [RFC] How to test panic handlers, without crashing the kernel"

Hi,

While writing a panic handler for DRM devices [1], I needed a way to test it without crashing the machine.
From debugfs, I called atomic_notifier_call_chain(&panic_notifier_list, ...), but that has the side effect of also calling every other registered panic notifier.
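
To illustrate the side effect, here is a minimal user-space model of a notifier chain. It is only a sketch: the model_* names, the counters and the two example handlers are hypothetical stand-ins and not the kernel implementation. It shows that walking the chain runs every registered notifier, while calling one callback directly runs only that one.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only: hypothetical stand-ins for the kernel notifier API. */
struct model_notifier {
	int (*call)(struct model_notifier *nb, unsigned long event, void *data);
	struct model_notifier *next;
	int priority;	/* higher priority runs first */
};

struct model_chain {
	struct model_notifier *head;
};

/* Insert the notifier, keeping the list sorted by descending priority. */
static void model_chain_register(struct model_chain *c, struct model_notifier *nb)
{
	struct model_notifier **p = &c->head;

	assert(nb->call != NULL);
	while (*p && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
}

/* Walk the whole chain: every registered notifier runs. Returns how many ran. */
static int model_chain_call(struct model_chain *c, unsigned long event, void *data)
{
	struct model_notifier *nb;
	int called = 0;

	for (nb = c->head; nb; nb = nb->next) {
		nb->call(nb, event, data);
		called++;
	}
	return called;
}

/* Example handlers (assumptions): each one only bumps its own counter. */
static int drm_hits, other_hits;

static int drm_panic_model(struct model_notifier *nb, unsigned long event, void *data)
{
	(void)nb; (void)event; (void)data;
	drm_hits++;
	return 0;
}

static int other_panic_model(struct model_notifier *nb, unsigned long event, void *data)
{
	(void)nb; (void)event; (void)data;
	other_hits++;
	return 0;
}
```

With both handlers registered, model_chain_call() bumps both counters, whereas invoking drm_panic_model through its own notifier block leaves other_hits untouched.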

Sima suggested moving that into the generic panic code, and testing all panic notifiers through a dedicated debugfs interface.

I can move that code to kernel/, but before doing that, I would like to know if you think that's the right way to test the panic code.

The second question is how to simulate a panic context non-destructively, so that we can test the panic notifiers in CI without crashing the machine. The worst case for a panic notifier is a panic that occurs in NMI context, but I don't know how to simulate that. The goal is to detect early whether a panic notifier tries to sleep or does anything else that is not allowed in a panic context.
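
As a rough illustration of the "detect sleeping early" idea, here is a user-space sketch. All sim_* names and both example handlers are hypothetical. It marks a simulated atomic section with a depth counter and records a violation whenever a handler reaches a point that may sleep, loosely modeled on what might_sleep() reports when CONFIG_DEBUG_ATOMIC_SLEEP is enabled. It does not simulate NMI context.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only: hypothetical names, not kernel code. */
static int sim_atomic_depth;		/* > 0 while inside the simulated panic context */
static int sim_sleep_violations;	/* sleeping attempts seen in that context */

/* Handlers call this before anything that may block. */
static void sim_might_sleep(void)
{
	if (sim_atomic_depth > 0)
		sim_sleep_violations++;
}

typedef void (*sim_panic_fn)(void);

/*
 * Run one handler inside the simulated panic context and return how many
 * sleeping attempts it made there.
 */
static int sim_run_in_panic_context(sim_panic_fn fn)
{
	int before = sim_sleep_violations;

	assert(fn != NULL);
	sim_atomic_depth++;
	fn();
	sim_atomic_depth--;
	return sim_sleep_violations - before;
}

/* Example handler that only does non-blocking work. */
static void good_handler(void)
{
}

/* Example handler that reaches a blocking point, e.g. taking a mutex. */
static void bad_handler(void)
{
	sim_might_sleep();
}
```

A CI check built along these lines would fail any handler for which sim_run_in_panic_context() returns a non-zero count. A real in-kernel version would need an equivalent of the depth counter that the existing debug checks already honor, which is part of what this RFC is asking about.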

Best regards,

--

Jocelyn

[1] https://patchwork.freedesktop.org/patch/580183/?series=122244&rev=8
