Re: [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls
From: Sergey Senozhatsky
Date: Fri Oct 28 2016 - 00:05:48 EST
Hello,
On (10/27/16 20:30), Linus Torvalds wrote:
> On Thu, Oct 27, 2016 at 8:49 AM, Sergey Senozhatsky
> <sergey.senozhatsky@xxxxxxxxx> wrote:
> >
> > RFC
> >
> > This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls. The basic mechanism is pretty much the
> > same -- at the beginning of a deadlock-prone section we switch to lock-less
> > printk callback, and return back to a default printk implementation at the
> > end; the messages are getting flushed to a logbuf buffer from a safer
> > context.
>
> This looks very reasonable to me.
>
> Does this also obviate the need for "printk_deferred()" that the
> scheduler and the clock code uses? Because that would be a lovely
> thing to look at if it doesn't..
I wish I could say that we can retire printk_deferred(), but no, we still
need it. it's rather simple to fix printk recursion (that's what the patch
set is doing), but printk deadlocks are much harder to handle. anything that
starts somewhere else but is somehow related to printk can deadlock (in the
worst case). I use this backtrace as an example:

 SyS_ioctl
  do_vfs_ioctl
   tty_ioctl
    n_tty_ioctl
     tty_mode_ioctl
      set_termios
       tty_set_termios
        uart_set_termios
         uart_change_speed
          FOO_serial_set_termios
           spin_lock_irqsave(&port->lock)     // lock the output port
           ....
           !! WARN() or pr_err() or printk()
               vprintk_emit()
                /* console_trylock() */
                console_unlock()
                 call_console_drivers()
                  FOO_write()
                   spin_lock_irqsave(&port->lock)     // already locked

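
To make the trace above concrete, here is a tiny user-space sketch. It is only a model: toy_lock, toy_console_write() and toy_set_termios_with_warning() are made-up names, not kernel code, and the lock reports failure where a real spin_lock_irqsave() would spin forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spinlock such as port->lock. */
struct toy_lock {
	bool held;
	int owner;	/* id of the context holding the lock; -1 when free */
};

/*
 * Try to take the lock. Returns false where a real, non-recursive
 * spinlock would spin forever (i.e. deadlock).
 */
static bool toy_trylock(struct toy_lock *l, int ctx)
{
	if (l->held)
		return false;
	l->held = true;
	l->owner = ctx;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);	/* unlocking a free lock is a bug in the model */
	l->held = false;
	l->owner = -1;
}

/* Models FOO_write(): the console driver needs the port lock to emit. */
static bool toy_console_write(struct toy_lock *port_lock, int ctx)
{
	if (!toy_trylock(port_lock, ctx))
		return false;	/* the real driver would deadlock here */
	toy_unlock(port_lock);
	return true;
}

/*
 * Models FOO_serial_set_termios(): takes the port lock and then "prints"
 * while still holding it. Returns whether the nested write could proceed.
 */
static bool toy_set_termios_with_warning(struct toy_lock *port_lock, int ctx)
{
	bool wrote;

	if (!toy_trylock(port_lock, ctx))
		return false;
	wrote = toy_console_write(port_lock, ctx);	/* same lock, same context */
	toy_unlock(port_lock);
	return wrote;
}
```

Called on a free lock, toy_console_write() succeeds, while toy_set_termios_with_warning() reports that the nested write cannot take the lock. The point is that the recursion did not start in printk(), so printk() has no chance to know the port lock is already held.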
with the current printk we can't tell for sure how many locks will
be acquired -- printk() can succeed in locking the console_sem and
start invoking console drivers (if any) from console_unlock(), or
it can fail, in which case we acquire only the logbuf spin_lock and
the console_sem spin_lock.
things can change *a bit* once we switch to async_printk, because
instead of doing console_unlock()->call_console_drivers(), printk()
will just wake_up() the printk_kthread. but still, it won't be enough
to remove printk_deferred() :(

 vprintk_emit()
  wake_up()
   spin_lock rq lock
    printk

will be safe. but

 wake_up()
  spin_lock rq lock
   printk
    vprintk_emit()
     wake_up()
      spin_lock rq lock

will deadlock.

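
A rough user-space model of the two cases, assuming async printk plus the "switch to a lock-less callback inside printk" idea from the cover letter. Everything here is hypothetical (toy_printk(), toy_wake_up(), the flags and the buffer are invented for illustration and do not match the real code); a `deadlocked` flag is set where a real kernel would spin forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool rq_locked;		/* toy stand-in for the runqueue lock */
static bool in_safe_section;	/* toy stand-in for the printk_safe mode */
static bool deadlocked;		/* set where a real kernel would spin forever */
static char safe_buf[128];	/* toy stand-in for the per-cpu buffer */

static void toy_printk(const char *msg, bool warn_under_rq_lock);

/* Models wake_up(): takes the rq lock, optionally warns while holding it. */
static void toy_wake_up(bool warn_under_rq_lock)
{
	if (rq_locked) {
		deadlocked = true;	/* second attempt to take the rq lock */
		return;
	}
	rq_locked = true;
	if (warn_under_rq_lock)
		toy_printk("warning under rq lock", false);
	rq_locked = false;
}

/* Models printk()/vprintk_emit() that wakes a printing thread. */
static void toy_printk(const char *msg, bool warn_under_rq_lock)
{
	if (in_safe_section) {
		/* recursive call: stash the message, take no locks */
		strncat(safe_buf, msg, sizeof(safe_buf) - strlen(safe_buf) - 1);
		return;
	}
	in_safe_section = true;		/* deadlock-prone section begins */
	toy_wake_up(warn_under_rq_lock);
	in_safe_section = false;	/* back to the default printk */
}
```

Calling toy_printk("hello", true) leaves `deadlocked` clear and the nested message in safe_buf, because the recursion started inside printk. Calling toy_wake_up(true) sets `deadlocked`: the rq lock was taken before printk got a chance to switch modes, which is the case where callers such as the scheduler still need printk_deferred().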
we can't even tell for sure what locks are "important" to printk().
a small and reasonable code refactoring somewhere in clock code/etc.
can accidentally change the whole picture by introducing "unsafe"
WARN_ON() or adding yet another lock to the printing path.
need to think more.
p.s.
we are planning to discuss printk-related issues next week in Santa Fe.
-ss