Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc

From: Dmitry Vyukov
Date: Fri Jan 15 2016 - 02:52:54 EST


On Wed, Dec 30, 2015 at 11:44 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> Hello,
>
> I've got the following lock order inversion report on
> 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). Unfortunately I
> failed to find a reproducer for it.

Ping. This is the most frequent failure during my testing now. J
Freyensee, you seem to wrote drivers/tty/n_tracerouter.c. Please take
a look. Thanks.


> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 4.4.0-rc7+ #181 Not tainted
> -------------------------------------------------------
> kworker/u9:2/35 is trying to acquire lock:
> (routelock){+.+...}, at: [<ffffffff82bc77c0>]
> n_tracerouter_receivebuf+0x20/0x40 drivers/tty/n_tracerouter.c:175
>
> but task is already holding lock:
> (&buf->lock){+.+...}, at: [<ffffffff82baf791>]
> flush_to_ldisc+0xe1/0x7f0 drivers/tty/tty_buffer.c:487
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&buf->lock){+.+...}:
> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
> [< inline >] __raw_spin_lock_irqsave
> include/linux/spinlock_api_smp.h:112
> [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70
> kernel/locking/spinlock.c:159
> [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
> [<ffffffff82b9a09a>] __isig+0x1a/0x50 drivers/tty/n_tty.c:1112
> [<ffffffff82b9c16e>] isig+0xae/0x2c0 drivers/tty/n_tty.c:1131
> [<ffffffff82b9ef02>] n_tty_receive_signal_char+0x22/0xf0
> drivers/tty/n_tty.c:1243
> [<ffffffff82ba4958>] n_tty_receive_char_special+0x1278/0x2bf0
> drivers/tty/n_tty.c:1289
> [< inline >] n_tty_receive_buf_fast drivers/tty/n_tty.c:1613
> [< inline >] __receive_buf drivers/tty/n_tty.c:1647
> [<ffffffff82ba7ca6>] n_tty_receive_buf_common+0x19d6/0x2450
> drivers/tty/n_tty.c:1745
> [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
> drivers/tty/n_tty.c:1780
> [< inline >] receive_buf drivers/tty/tty_buffer.c:450
> [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
> drivers/tty/tty_buffer.c:517
> [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> kernel/workqueue.c:2030
> [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
> [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
> [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> arch/x86/entry/entry_64.S:468
>
> -> #2 (&o_tty->termios_rwsem/1){++++..}:
> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
> [<ffffffff85c899cc>] down_read+0x4c/0x70 kernel/locking/rwsem.c:22
> [<ffffffff82ba635d>] n_tty_receive_buf_common+0x8d/0x2450
> drivers/tty/n_tty.c:1708
> [<ffffffff82ba8753>] n_tty_receive_buf2+0x33/0x40
> drivers/tty/n_tty.c:1780
> [< inline >] receive_buf drivers/tty/tty_buffer.c:450
> [<ffffffff82bafa6f>] flush_to_ldisc+0x3bf/0x7f0
> drivers/tty/tty_buffer.c:517
> [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> kernel/workqueue.c:2030
> [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
> [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
> [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> arch/x86/entry/entry_64.S:468
>
> -> #1 (&port->buf.lock/1){+.+...}:
> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
> [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
> [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
> kernel/locking/mutex.c:618
> [<ffffffff82bc77ff>] n_tracerouter_open+0x1f/0x210
> drivers/tty/n_tracerouter.c:75
> [<ffffffff82bacd78>] tty_ldisc_open.isra.2+0x78/0xd0
> drivers/tty/tty_ldisc.c:447
> [<ffffffff82bad43a>] tty_set_ldisc+0x1ca/0xa30
> drivers/tty/tty_ldisc.c:567
> [< inline >] tiocsetd drivers/tty/tty_io.c:2650
> [<ffffffff82b9483a>] tty_ioctl+0xb2a/0x2160 drivers/tty/tty_io.c:2883
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [<ffffffff81750d41>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
> [< inline >] SYSC_ioctl fs/ioctl.c:622
> [<ffffffff8175158f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
> [<ffffffff85c8eb36>] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
>
> -> #0 (routelock){+.+...}:
> [< inline >] check_prev_add kernel/locking/lockdep.c:1853
> [< inline >] check_prevs_add kernel/locking/lockdep.c:1958
> [< inline >] validate_chain kernel/locking/lockdep.c:2144
> [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0
> kernel/locking/lockdep.c:3206
> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
> kernel/locking/lockdep.c:3585
> [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
> [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940
> kernel/locking/mutex.c:618
> [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
> drivers/tty/n_tracerouter.c:175
> [< inline >] receive_buf drivers/tty/tty_buffer.c:454
> [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0
> drivers/tty/tty_buffer.c:517
> [<ffffffff8133833c>] process_one_work+0x76c/0x13e0
> kernel/workqueue.c:2030
> [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
> [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
> [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70
> arch/x86/entry/entry_64.S:468
>
> other info that might help us debug this:
>
> Chain exists of:
> routelock --> &o_tty->termios_rwsem/1 --> &buf->lock
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&buf->lock);
> lock(&o_tty->termios_rwsem/1);
> lock(&buf->lock);
> lock(routelock);
>
> *** DEADLOCK ***
>
> 4 locks held by kworker/u9:2/35:
> #0: ("events_unbound"){.+.+.+}, at: [< inline >]
> __write_once_size include/linux/compiler.h:247
> #0: ("events_unbound"){.+.+.+}, at: [< inline >]
> atomic64_set ./arch/x86/include/asm/atomic64_64.h:33
> #0: ("events_unbound"){.+.+.+}, at: [< inline >]
> atomic_long_set include/asm-generic/atomic-long.h:56
> #0: ("events_unbound"){.+.+.+}, at: [< inline >]
> set_work_data kernel/workqueue.c:614
> #0: ("events_unbound"){.+.+.+}, at: [< inline >]
> set_work_pool_and_clear_pending kernel/workqueue.c:641
> #0: ("events_unbound"){.+.+.+}, at: [<ffffffff81338242>]
> process_one_work+0x672/0x13e0 kernel/workqueue.c:2023
> #1: ((&buf->work)){+.+...}, at: [<ffffffff81338272>]
> process_one_work+0x6a2/0x13e0 kernel/workqueue.c:2027
> #2: (&tty->ldisc_sem){++++++}, at: [<ffffffff82bac99b>]
> tty_ldisc_ref+0x1b/0x80 drivers/tty/tty_ldisc.c:283
> #3: (&buf->lock){+.+...}, at: [<ffffffff82baf791>]
> flush_to_ldisc+0xe1/0x7f0 drivers/tty/tty_buffer.c:487
>
> stack backtrace:
> CPU: 0 PID: 35 Comm: kworker/u9:2 Not tainted 4.4.0-rc7+ #181
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: events_unbound flush_to_ldisc
> 00000000ffffffff ffff88003d8e77f8 ffffffff8289d9dd ffffffff8866fbf0
> ffffffff8863f050 ffffffff886422f0 ffff88003d8e7840 ffffffff813e7081
> ffff88003d8e7930 000000003d8b1780 ffff88003d8b1fc0 ffff88003d8b1fe2
> Call Trace:
> [< inline >] validate_chain kernel/locking/lockdep.c:2144
> [<ffffffff813ee02e>] __lock_acquire+0x345e/0x3ca0 kernel/locking/lockdep.c:3206
> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
> [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
> [<ffffffff85c86ba6>] mutex_lock_nested+0xb6/0x940 kernel/locking/mutex.c:618
> [<ffffffff82bc77c0>] n_tracerouter_receivebuf+0x20/0x40
> drivers/tty/n_tracerouter.c:175
> [< inline >] receive_buf drivers/tty/tty_buffer.c:454
> [<ffffffff82bafc34>] flush_to_ldisc+0x584/0x7f0 drivers/tty/tty_buffer.c:517
> [<ffffffff8133833c>] process_one_work+0x76c/0x13e0 kernel/workqueue.c:2030
> [<ffffffff81339093>] worker_thread+0xe3/0xe90 kernel/workqueue.c:2162
> [<ffffffff8134b63f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
> [<ffffffff85c8eeef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
> ======================================================