Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()

From: Sergey Senozhatsky
Date: Wed Sep 05 2018 - 22:31:59 EST


Hello,

On (09/05/18 21:33), Steven Rostedt wrote:
> do_idle {
>
> [interrupts enabled]
>
> <interrupt> [interrupts disabled]
> TRACE_IRQS_OFF [lockdep says irqs off]
> [...]
> TRACE_IRQS_IRET
> test if pt_regs say return to interrupts enabled [yes]
> TRACE_IRQS_ON [lockdep says irqs are on]
>
> <nmi>
> nmi_enter() {
> printk_nmi_enter() [traced by ftrace]
> [ hit ftrace breakpoint ]
> <breakpoint exception>
> TRACE_IRQS_OFF [lockdep says irqs off]
> [...]
> TRACE_IRQS_IRET [return from breakpoint]
> test if pt_regs say interrupts enabled [no]
> [iret back to interrupt]
> [iret back to code]
>
> tick_nohz_idle_enter() {
>
> lockdep_assert_irqs_enabled() [lockdep say no!]
>
> Although interrupts are indeed enabled, lockdep thinks it is not, and since
> we now do asserts via lockdep, it gives a false warning. The issue here is
> that printk_nmi_enter() is called before lockdep_off(), which disables
> lockdep (for this reason) in NMIs. By simply not allowing ftrace to see
> printk_nmi_enter() (via notrace annotation) we keep lockdep from getting
> confused.

Great catch and I like the patch!

Indeed, with printk_nmi we changed the "everything that nmi_enter does
should happen after lockdep_off" to "everything that nmi_enter does should
happen after printk_nmi_enter" // +similar change to nmi_exit.

An alternative option, thus, could be re-instating back the rule that
lockdep_off/on should be the first and the last thing we do in
nmi_enter/nmi_exit. E.g.

nmi_enter()
lockdep_off();
printk_nmi_enter();

nmi_exit()
printk_nmi_exit();
lockdep_on();

I guess that we can keep printk_nmi_enter/printk_nmi_exit at the top
and at the bottom of nmi_enter/nmi_exit correspondingly just in case
if lockdep_off/lockdep_on sometime in the future starts invoking printk(),
which would deadlock us. Hence

Acked-by: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>


> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 099f1c84c0052 ("printk: introduce per-cpu safe_print seq buffer")

A minor nitpick,

printk_nmi was introduced by 42a0bb3f71383b457 ("printk/nmi: generic
solution for safe printk in NMI") - in that commit we declared a new
printk_nmi_enter/exit should be the first and the last thing we do in
nmi_enter/nmi_exit rule.

-ss