Re: [GIT PULL] printk for 4.13

From: Sergey Senozhatsky
Date: Wed Jul 05 2017 - 20:44:01 EST

Next message: Linus Torvalds: "Re: [PATCH] mm: larger stack guard gap, between vmas"
Previous message: Jin, Yao: "Re: [PATCH v3 0/2] perf report: Implement visual marker for macro fusion in annotate"
In reply to: Linus Torvalds: "Re: [GIT PULL] printk for 4.13"
Next in thread: Sergey Senozhatsky: "Re: [GIT PULL] printk for 4.13"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hello,

On (07/05/17 11:23), Linus Torvalds wrote:
> On Mon, Jul 3, 2017 at 7:24 AM, Petr Mladek <pmladek@xxxxxxxx> wrote:
> >
> > - Store printk() messages into the main log buffer directly even in NMI
> > when the lock is available. It is the best effort to print even large
> > chunk of text. It is handy, for example, when all ftrace messages
> > are printed during the system panic in NMI.
>
> No, this is entirely buggered.
>
> You can't just do
>
> raw_spin_is_locked(&logbuf_lock)
>
> to test whether you can get the logbuf_lock and then call vprintk_default().
>
> It's not just about deadlock avoidance, the code will call things like
> down_trylock() on the console semaphore, and that operation is
> fundamentally not NMI-safe. The semaphore count is literally protected
> by a irq-safe (BUT NOT NMI-SAFE!) semaphore spinlock.
>
> So now you can instead deadlock just on the internal console semaphore
> spinlock (ie somebody is doing "console_lock()", NMI comes in, *BOOM*.

you are right. but

PRINTK_NMI_DEFERRED_CONTEXT_MASK calls into vprintk_deferred(), which
does vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args). and
vprintk_emit() LOGLEVEL_SCHED is nothing, but printk_deferred(), which
never calls into the scheduler and does not even try to lock console_sem.
the only lock it takes is logbuf_lock, and the idea behind those checks
in printk_nmi_enter() is to make sure that we can safely call
vprintk_deferred() from NMI on that CPU.

printk_safe context is per-CPU, so if we see at NMI entry that this_cpu
has PRINTK_SAFE_CONTEXT_MASK set, then it means that NMI is preempting
printk-safe context on this_cpu, which is already alarming, because we
switch to printk-safe context *mostly* in places where printk recursion
can deadlock us (console_sem up/down/try_down and logbuf_lock lock/unlock).

why *mostly*... because we are also in printk-safe context during the
call_console_drivers() call.

so this is why we also check raw_spin_is_locked(&logbuf_lock). by the
time we do raw_spin_is_locked(&logbuf_lock) we already know that
this_cpu is in printk-safe, so it is _potentially_ unsafe to call
vprintk_emit(), unless we are printk-safe for call_console_drivers(),
which means that logbuf_lock is not lockde by this_cpu.

so what printk_nmi_enter() does

- this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK

if we are NOT in printk-safe then we know that from this CPU we can
call printk_deferred().

otherwise, we need to make sure that it is safe to call printk_deferred().

- raw_spin_is_locked(&logbuf_lock)

if the lock is not locked -- we can printk_deferred() from that CPU.

otherwise, we use printk-nmi buffer and flush the messages to logbuf
later from irq_work.

does this make sense?

-ss

Next message: Linus Torvalds: "Re: [PATCH] mm: larger stack guard gap, between vmas"
Previous message: Jin, Yao: "Re: [PATCH v3 0/2] perf report: Implement visual marker for macro fusion in annotate"
In reply to: Linus Torvalds: "Re: [GIT PULL] printk for 4.13"
Next in thread: Sergey Senozhatsky: "Re: [GIT PULL] printk for 4.13"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]