Re: NMI watchdog dump does not print on hard lockup

From: Sergey Senozhatsky
Date: Mon Oct 16 2017 - 09:13:18 EST


Hello,

On (10/16/17 13:12), Petr Mladek wrote:
[..]
> > I think an NMI watchdog should just force the flush - the same way an
> > oops should. Deadlocks aren't really relevant if something doesn't get
> > printed out anyway.
>
> We explicitly flush the NMI buffers in panic() when there is
> no other way to see them. But it is questionable in other situations.
> Sometimes the flush might be the only way to see the messages
> and sometimes printk() might unnecessarily cause a deadlock.
> IMHO, the only solution is to make it optional.

just "brainstorming" it... with some silly ideas.

pushing the data from NMI panic might look like we are replacing one
deadlock scenario with another deadlock scenario. some of the console
drivers are soooo complex internally. so I have been thinking about...
maybe we can extend struct console and add a ->write_on_panic() handler.
that handler must be as lockless as possible; so lockless that calling
it from anything that is not panic() is a severe bug.

an absolutely trivial case: if the serial console's write callback does

static void console_write_cb(struct console *co, const char *s, unsigned int count)
{
	unsigned long flags;

	/* port: this console's struct uart_port (lookup omitted) */
	spin_lock_irqsave(&port->lock, flags);
	uart_console_write(port, s, count, console_putchar);
	spin_unlock_irqrestore(&port->lock, flags);
}

then the panic callback can look like

static void console_write_on_panic_cb(struct console *co, const char *s, unsigned int count)
{
	/* port: same lookup as above (omitted) */
	/* no, we don't take the port lock here */
	uart_console_write(port, s, count, console_putchar);
}

a less trivial case might look more involved. but in general that
write_on_panic() callback must do the absolute minimum of work. so
it's sort of an early console, but as part of the normal console driver.
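
to make the dispatch side a bit more concrete, here is a minimal
user-space sketch in C. it is not kernel code: struct console_model,
model_in_panic, model_panic_flush() and the port_locked flag (standing
in for port->lock) are all made up for illustration. skipping consoles
that don't provide ->write_on_panic() is also just an assumption, the
proposal above does not say what should happen to them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* User-space model only; this is NOT the kernel's struct console. */
struct console_model {
	const char *name;
	void (*write)(struct console_model *co, const char *s,
		      unsigned int count);
	/* Hypothetical callback from the proposal; NULL if not provided. */
	void (*write_on_panic)(struct console_model *co, const char *s,
			       unsigned int count);
	bool port_locked;	/* stands in for port->lock */
	char out[128];		/* stands in for the UART */
	size_t out_len;
};

/* Hypothetical flag: set once the panic path has been entered. */
static bool model_in_panic;

/* Append to the fake UART buffer, truncating when it is full. */
static void model_emit(struct console_model *co, const char *s,
		       unsigned int count)
{
	size_t room = sizeof(co->out) - 1 - co->out_len;

	if (count > room)
		count = (unsigned int)room;
	memcpy(co->out + co->out_len, s, count);
	co->out_len += count;
	co->out[co->out_len] = '\0';
}

/* Normal path: needs the port lock. */
static void model_write(struct console_model *co, const char *s,
			unsigned int count)
{
	if (co->port_locked)
		return;	/* a real spin_lock_irqsave() would spin forever here */
	co->port_locked = true;
	model_emit(co, s, count);
	co->port_locked = false;
}

/* Panic path: takes no lock; calling it outside of panic is a bug. */
static void model_write_on_panic(struct console_model *co, const char *s,
				 unsigned int count)
{
	assert(model_in_panic);
	model_emit(co, s, count);
}

/*
 * Push a message through every console that has a panic callback.
 * Returns the number of consoles used. Consoles without the callback
 * are skipped (an assumption of this sketch).
 */
static unsigned int model_panic_flush(struct console_model **cons, size_t n,
				      const char *s)
{
	unsigned int used = 0;
	size_t i;

	model_in_panic = true;
	for (i = 0; i < n; i++) {
		if (!cons[i]->write_on_panic)
			continue;
		cons[i]->write_on_panic(cons[i], s, (unsigned int)strlen(s));
		used++;
	}
	return used;
}
```

the point of the model: if the port lock is held by a CPU that is
never coming back, model_write() cannot make progress, while
model_panic_flush() still gets the message out via ->write_on_panic().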

I also got some other serial console crazy ideas, but they are not
related to this topic.

-ss