Re: printk() from NMI backtrace can delay a lot

From: Sergey Senozhatsky
Date: Mon Jul 02 2018 - 06:39:29 EST


Hi,

On (07/02/18 19:26), Tetsuo Handa wrote:
> Hello.
>
> Today I was testing conditions when/how stall watchdog fires. I noticed that
> printing NMI backtraces to consoles is delayed till IRQ is enabled or somebody
> else schedules printk(). This is not a welcomed behavior when the cause of
> lock up is doing nearly-infinite loop with IRQ disabled. Can we improve this?

Hmm. We can't call console drivers from NMI: this can deadlock on
uart/etc locks. So we always need [except for panic()] someone else to
print the NMI messages for us. Either it's an IRQ on the local CPU (we
need two IRQs actually - one to flush the printk_nmi buffer and a second
one to do console_trylock()->console_unlock()), or a printk() from another
CPU that would print the pending logbuf entries. We used to have a fast path for
print_nmi messages (direct_nmi), which soon will be used only for
NMI->ftrace_dump(). Even if we re-introduce that fast path for printk_nmi
[maybe we can do printk_direct_nmi type of checks for printk_nmi as well]
we still can't print anything from the NMI CPU.
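
To make the two-step path above a bit more concrete, here is a toy
userspace model of it. This is only a sketch, not kernel code: every
toy_* name is made up for illustration, and the "console lock" is a
plain flag rather than a real lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_BUF_SZ 256

/* Hypothetical per-CPU state; none of these names exist in the kernel. */
struct toy_cpu {
	bool in_nmi;
	char nmi_buf[TOY_BUF_SZ];	/* messages stashed from NMI context */
};

static char toy_logbuf[TOY_BUF_SZ];	/* stands in for the shared logbuf */
static char toy_console[TOY_BUF_SZ];	/* what actually reached the console */
static bool toy_console_locked;		/* stands in for the console lock */

/* Append src to dst without overflowing a TOY_BUF_SZ buffer. */
static void toy_append(char *dst, const char *src)
{
	strncat(dst, src, TOY_BUF_SZ - strlen(dst) - 1);
}

/* NMI context: only stash the message, never touch the console. */
static void toy_printk_nmi(struct toy_cpu *cpu, const char *msg)
{
	assert(cpu->in_nmi);
	toy_append(cpu->nmi_buf, msg);
}

/* Step 1 (first IRQ): move the per-CPU NMI messages into the logbuf. */
static void toy_flush_nmi_buf(struct toy_cpu *cpu)
{
	assert(!cpu->in_nmi);	/* runs only after the NMI has returned */
	toy_append(toy_logbuf, cpu->nmi_buf);
	cpu->nmi_buf[0] = '\0';
}

/*
 * Step 2 (second IRQ, or printk() from another CPU): trylock the console
 * and print whatever is pending. Returns false if the lock was busy.
 */
static bool toy_console_flush(void)
{
	if (toy_console_locked)
		return false;
	toy_console_locked = true;
	toy_append(toy_console, toy_logbuf);
	toy_logbuf[0] = '\0';
	toy_console_locked = false;
	return true;
}
```

In this model a message written by toy_printk_nmi() stays invisible
until both toy_flush_nmi_buf() and toy_console_flush() have run, which
is the delay being discussed: if the CPU spins with IRQs disabled,
neither step happens locally.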

I need to look more at the data you have provided.

-ss