Re: NMIs reported by console_blast.sh with 6.6.20-rt25

From: Sebastian Andrzej Siewior
Date: Tue Apr 02 2024 - 06:35:41 EST


On 2024-03-27 19:44:20 [-0400], John B. Wyatt IV wrote:
> > where is this output from? The `ret' opcode usually does not cause a
> > trap. My guess is that the machine has been interrupted by an external
> > user at this position.
>
> Just before the sysrq that crashes the system.

So this is intentional.


> > Side note: This is using early_printk, correct?
>
> I believe so, but it might be preempted? This is the part it stopped in.
>
> static void io_serial_out(unsigned long addr, int offset, int value)
> {
> 	outb(value, addr + offset);
> }

The function is invoked in NMI context so it can't be preempted.

> > According to this, someone issued a `crash' via sysrq. Why?
> >
>
> This is part of the console_blast.sh script that John Ogness sent me.
>
> Please see below:


Okay. Then everything works as it should…

> > > NMI Backtrace for 6.6.20-rt25 no forced preemption with tuned throughput-performance profile
> > > -----------------------------
> >
> > This and the following backtrace show the same picture: The CPU is
> > crashing due to a proc/sysrq request and does CPU-backtraces via NMI and
> > polls in early_printk, waiting for the UART to become idle (probably).
> >
> > I don't see an issue here so far.
>
> Luis Goncalves discussed it with me after reading your response. Thank
> you for your help. The NMI was needed to flush the buffers upon the
> system crashing itself. Does this part about NMI watchdog need to be
> documented?

Not sure about that one. There is _a_ _lot_ to be printed from NMI
and the NMI watchdog might trigger if nothing is touching (resetting)
the NMI-watchdog during the print job. Also, the crash was requested.
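
To illustrate the point about a long print job, here is a minimal
user-space sketch. It is not kernel code: every sim_* name, the tick
model and the numbers are made up for illustration. It only models a
writer that busy-polls a UART status bit per character while a watchdog
counts polls since it was last reset. In the kernel the reset is done
with touch_nmi_watchdog(); the real watchdog works on time and
interrupt counts, not on poll counts.

```c
#include <assert.h>
#include <stdbool.h>

/* Transmit-holding-register-empty bit, as on a 16550 UART. */
#define SIM_LSR_THRE 0x20

/* Hypothetical simulated UART: busy for a fixed number of polls. */
struct sim_uart {
	unsigned int busy_polls;
};

/* Hypothetical watchdog: fires once idle_polls exceeds limit. */
struct sim_watchdog {
	unsigned int idle_polls;
	unsigned int limit;
	bool fired;
};

/* Return the simulated line status; 0 while the transmitter is busy. */
static unsigned int sim_read_lsr(struct sim_uart *u)
{
	if (u->busy_polls > 0) {
		u->busy_polls--;
		return 0;
	}
	return SIM_LSR_THRE;
}

/* One poll passed without the watchdog being reset. */
static void sim_watchdog_tick(struct sim_watchdog *w)
{
	if (++w->idle_polls > w->limit)
		w->fired = true;
}

/* Stand-in for resetting the watchdog from the print path. */
static void sim_watchdog_touch(struct sim_watchdog *w)
{
	w->idle_polls = 0;
}

/*
 * "Print" nchars characters, busy-polling polls_per_char times for
 * each one. If touch is true the watchdog is reset after every
 * character. Returns true if the watchdog fired during the job.
 */
static bool sim_print(unsigned int nchars, unsigned int polls_per_char,
		      unsigned int limit, bool touch)
{
	struct sim_watchdog w = { 0, limit, false };
	unsigned int i;

	assert(limit > 0);

	for (i = 0; i < nchars; i++) {
		struct sim_uart u = { polls_per_char };

		/* Wait for the transmitter to become idle. */
		while (!(sim_read_lsr(&u) & SIM_LSR_THRE))
			sim_watchdog_tick(&w);

		if (touch)
			sim_watchdog_touch(&w);
	}
	return w.fired;
}
```

In this model sim_print(100, 50, 1000, false) lets the watchdog fire
because 5000 polls pass without a reset, while the same job with
touch set to true never exceeds 50 polls between resets.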

Sebastian