Re: [PATCH v4 0/4] arm64: cross-CPU NMI via SDEI

From: Kiryl Shutsemau

Date: Mon Jun 29 2026 - 09:42:37 EST


On Fri, Jun 26, 2026 at 08:40:57PM +0100, Kiryl Shutsemau wrote:
> But I have not tried calling CPU_OFF directly, without completing the
> event. I assumed it is required. Will give it a try when I have time.

Tried it now, and it doesn't work either -- in a more interesting way.

Calling PSCI CPU_OFF directly from the SDEI handler (event left
uncompleted) reproducibly breaks the kdump capture kernel, and this
reproduces under QEMU's TF-A, not just on Grace -- so it isn't a Grace
firmware quirk.

The test: a CPU wedged with interrupts masked is stopped via the SDEI
rung; its handler calls __cpu_try_die() instead of parking. A/B in QEMU,
changing only that wedged CPU's handling (everything else identical):

- park it (current series): capture kernel boots fully to a shell.
- CPU_OFF from the handler: capture kernel hangs in early boot, around
  SDEI re-init, and never reaches a shell.
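
For anyone who wants to reason about the A/B without the series at
hand, here is a rough userspace model of the switch. It is only a
sketch: every model_* name, the struct and the enum are made up for
illustration and are not from the patches. model_park() and
model_try_die() merely stand in for parking and for __cpu_try_die();
they record what would have happened rather than doing it.

```c
#include <assert.h>
#include <stdbool.h>

/* Which variant of the A/B test the wedged CPU's handler runs. */
enum stop_mode {
	STOP_PARK,	/* A: park the CPU inside the handler */
	STOP_CPU_OFF,	/* B: PSCI CPU_OFF from inside the handler */
};

/* Bookkeeping for the model only; not a kernel structure. */
struct model_cpu {
	bool sdei_event_active;	/* handler entered, event not completed */
	bool parked;
	bool powered_off;
};

/* Stand-in for parking; the real park loop never returns. */
static void model_park(struct model_cpu *c)
{
	c->parked = true;
}

/* Stand-in for __cpu_try_die(); the real path ends in PSCI CPU_OFF. */
static void model_try_die(struct model_cpu *c)
{
	c->powered_off = true;
}

/* Hypothetical stop handler: only the switch differs between A and B. */
static void model_sdei_stop_handler(struct model_cpu *c, enum stop_mode mode)
{
	/* Event is active on handler entry; this model never completes it. */
	c->sdei_event_active = true;

	switch (mode) {
	case STOP_PARK:
		model_park(c);
		break;
	case STOP_CPU_OFF:
		model_try_die(c);
		break;
	default:
		assert(!"unknown stop mode");
	}
}

/* True when the PE was powered off with its event still active. */
static bool model_off_with_active_event(const struct model_cpu *c)
{
	return c->powered_off && c->sdei_event_active;
}
```

Running the handler once per mode on two zeroed model_cpu instances
gives parked-but-powered for A and off-with-active-event for B, which
is the state the capture kernel then has to cope with.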

Powering the PE off while its SDEI event is still active leaves EL3's
dispatch state dangling, and the capture kernel trips over it. Completing
the event first and then calling CPU_OFF -- what I tried originally --
silently wedges EL3 on Grace instead.

So both routes to powering the CPU off fail, and the CPU stays parked.
The dump is complete either way; the only thing lost is re-onlining the
stopped CPU in an SMP capture kernel. It's a cheap QEMU repro now if
anyone wants to dig into the EL3 side.

--
Kiryl Shutsemau / Kirill A. Shutemov