Re: NMI watchdog dump does not print on hard lockup
From: Sergey Senozhatsky
Date: Tue Oct 23 2018 - 02:49:13 EST
On (10/16/17 10:15), Steven Rostedt wrote:
> On Mon, 16 Oct 2017 22:13:05 +0900
> Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx> wrote:
>
> > just "brainstorming" it... with some silly ideas.
> >
> > pushing the data from NMI panic might look like we are replacing one
> > deadlock scenario with another deadlock scenario. some of the console
> > drivers are soooo complex internally. so I have been thinking about...
> > may be we can extend struct console and add ->write_on_panic() and that
> > handler must be as lockless as possible; so lockless that calling it
> > from anything that is not panic() is a severe bug.
>
> This may not be a bad idea. And make it so it can't be called unless we
> are in panic mode (or at least "oops in progress").
>
> If oops_in_progress is set, and the console has a "write_on_panic"
> handler, then just call that.

Good news, Steven.

It turned out that some of the serial consoles already have this
write_on_panic() mechanism enabled. Such consoles have the following
thing in their usual ->write() callbacks (which we call from printk()):

static void serial_console_write(struct console *co, const char *s,
				 unsigned count)
{
	...
	if (port->sysrq)
		locked = 0;
	else if (oops_in_progress)
		locked = spin_trylock_irqsave(&port->lock, flags);
	else
		spin_lock_irqsave(&port->lock, flags);
	...
	uart_console_write(port, s, count, serial_console_putchar);
	...
	if (locked)
		spin_unlock_irqrestore(&port->lock, flags);
}

Notice the special handling of port->sysrq and oops_in_progress cases.
So we, basically, already have "lockless on panic" serial consoles.
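
To illustrate the pattern outside of the kernel, here is a minimal
userspace sketch. Everything prefixed with demo_ is hypothetical and
only stands in for the kernel pieces: a C11 atomic_flag plays the role
of port->lock, demo_oops_in_progress plays the role of oops_in_progress,
and a static buffer plays the role of the UART. It is not the driver
code, just the same "trylock when oopsing, write anyway" idea:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for oops_in_progress and port->lock. */
static volatile int demo_oops_in_progress;
static atomic_flag demo_port_lock = ATOMIC_FLAG_INIT;

/* Hypothetical "device": bytes written to the console end up here. */
static char demo_out[256];
static size_t demo_out_len;

/*
 * Write @count bytes of @s to the demo console.
 * Returns true if the lock was held during the write, false if we
 * went ahead without it (lock was busy while "oops in progress").
 */
static bool demo_console_write(const char *s, size_t count)
{
	bool locked = true;

	assert(s != NULL);

	if (demo_oops_in_progress) {
		/* One attempt only; never wait for a lock owner that may be dead. */
		locked = !atomic_flag_test_and_set(&demo_port_lock);
	} else {
		/* Normal path: spin until the lock is ours. */
		while (atomic_flag_test_and_set(&demo_port_lock))
			;
	}

	/* Emit the data whether or not we got the lock. */
	if (count > sizeof(demo_out) - demo_out_len)
		count = sizeof(demo_out) - demo_out_len;
	memcpy(demo_out + demo_out_len, s, count);
	demo_out_len += count;

	/* Only release what we actually acquired. */
	if (locked)
		atomic_flag_clear(&demo_port_lock);

	return locked;
}
```

If the lock is already taken and demo_oops_in_progress is set, the
write still goes out and the function simply skips the unlock. With the
flag clear, the same call would spin until the lock is released.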

The problem is that panic() does not always seem to let lockless
consoles be lockless. I'm trying to address this in [1].

[1] lkml.kernel.org/r/20181016050428.17966-2-sergey.senozhatsky@xxxxxxxxx
-ss