Re: possible deadlock in snd_seq_deliver_event

From: Takashi Iwai
Date: Tue Oct 31 2017 - 04:10:55 EST


On Sun, 29 Oct 2017 10:57:58 +0100,
Takashi Iwai wrote:
>
> On Fri, 27 Oct 2017 10:11:18 +0200,
> Dmitry Vyukov wrote:
> >
> > On Fri, Oct 27, 2017 at 10:09 AM, syzbot
> > <bot+7feb8de6b4d6bf810cf098bef942cc387e79d0ad@xxxxxxxxxxxxxxxxxxxxxxxxx>
> > wrote:
> > > Hello,
> > >
> > > syzkaller hit the following crash on
> > > 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e
> > > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
> > > compiler: gcc (GCC) 7.1.1 20170620
> > > .config is attached
> > > Raw console output is attached.
> > > C reproducer is attached
> > > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > > for information about syzkaller reproducers
> > >
> > >
> > > ============================================
> > > WARNING: possible recursive locking detected
> > > 4.14.0-rc1+ #88 Not tainted
> > > --------------------------------------------
> > > syzkaller883997/2981 is trying to acquire lock:
> > > (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>] deliver_to_subscribers
> > > sound/core/seq/seq_clientmgr.c:666 [inline]
> > > (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>]
> > > snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
> > >
> > > but task is already holding lock:
> > > (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>] deliver_to_subscribers
> > > sound/core/seq/seq_clientmgr.c:666 [inline]
> > > (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>]
> > > snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
> > >
> > > other info that might help us debug this:
> > > Possible unsafe locking scenario:
> > >
> > > CPU0
> > > ----
> > > lock(&grp->list_mutex);
> > > lock(&grp->list_mutex);
> > >
> > > *** DEADLOCK ***
> > >
> > > May be due to missing lock nesting notation
> > >
> > > 2 locks held by syzkaller883997/2981:
> > > #0: (register_mutex#4){+.+.}, at: [<ffffffff83d60ada>]
> > > odev_release+0x4a/0x70 sound/core/seq/oss/seq_oss.c:152
> > > #1: (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>]
> > > deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
> > > #1: (&grp->list_mutex){++++}, at: [<ffffffff83d4dd49>]
> > > snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
> > >
> > > stack backtrace:
> > > CPU: 1 PID: 2981 Comm: syzkaller883997 Not tainted 4.14.0-rc1+ #88
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > Call Trace:
> > > __dump_stack lib/dump_stack.c:16 [inline]
> > > dump_stack+0x194/0x257 lib/dump_stack.c:52
> > > print_deadlock_bug kernel/locking/lockdep.c:1797 [inline]
> > > check_deadlock kernel/locking/lockdep.c:1844 [inline]
> > > validate_chain kernel/locking/lockdep.c:2453 [inline]
> > > __lock_acquire+0x1232/0x4620 kernel/locking/lockdep.c:3498
> > > lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4002
> > > down_read+0x96/0x150 kernel/locking/rwsem.c:23
> > > deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
> > > snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
> > > snd_seq_kernel_client_dispatch+0x11e/0x150
> > > sound/core/seq/seq_clientmgr.c:2309
> > > dummy_input+0x2c4/0x400 sound/core/seq/seq_dummy.c:104
> > > snd_seq_deliver_single_event.constprop.11+0x2fb/0x940
> > > sound/core/seq/seq_clientmgr.c:621
> > > deliver_to_subscribers sound/core/seq/seq_clientmgr.c:676 [inline]
> > > snd_seq_deliver_event+0x318/0x790 sound/core/seq/seq_clientmgr.c:807
> > > snd_seq_kernel_client_dispatch+0x11e/0x150
> > > sound/core/seq/seq_clientmgr.c:2309
> > > dummy_input+0x2c4/0x400 sound/core/seq/seq_dummy.c:104
> > > snd_seq_deliver_single_event.constprop.11+0x2fb/0x940
> > > sound/core/seq/seq_clientmgr.c:621
> > > snd_seq_deliver_event+0x12c/0x790 sound/core/seq/seq_clientmgr.c:818
> > > snd_seq_kernel_client_dispatch+0x11e/0x150
> > > sound/core/seq/seq_clientmgr.c:2309
> > > snd_seq_oss_dispatch sound/core/seq/oss/seq_oss_device.h:150 [inline]
> > > snd_seq_oss_midi_reset+0x44b/0x700 sound/core/seq/oss/seq_oss_midi.c:481
> > > snd_seq_oss_synth_reset+0x398/0x980 sound/core/seq/oss/seq_oss_synth.c:416
> > > snd_seq_oss_reset+0x6c/0x260 sound/core/seq/oss/seq_oss_init.c:448
> > > snd_seq_oss_release+0x71/0x120 sound/core/seq/oss/seq_oss_init.c:425
> > > odev_release+0x52/0x70 sound/core/seq/oss/seq_oss.c:153
> > > __fput+0x333/0x7f0 fs/file_table.c:210
> > > ____fput+0x15/0x20 fs/file_table.c:244
> > > task_work_run+0x199/0x270 kernel/task_work.c:112
> > > exit_task_work include/linux/task_work.h:21 [inline]
> > > do_exit+0xa52/0x1b40 kernel/exit.c:865
> > > do_group_exit+0x149/0x400 kernel/exit.c:968
> > > SYSC_exit_group kernel/exit.c:979 [inline]
> > > SyS_exit_group+0x1d/0x20 kernel/exit.c:977
> > > entry_SYSCALL_64_fastpath+0x1f/0xbe
> > > RIP: 0033:0x442c58
> > > RSP: 002b:00007ffd15d4f8d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> > > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000442c58
> > > RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> > > RBP: 0000000000000082 R08: 00000000000000e7 R09: ffffffffffffffd0
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401ca0
> > > R13: 0000000000401d30 R14
> >
> > I've just re-reproduced this on upstream
> > 15f859ae5c43c7f0a064ed92d33f7a5bc5de6de0 (Oct 26):
> >
> > ============================================
> > WARNING: possible recursive locking detected
> > 4.14.0-rc6+ #10 Not tainted
> > --------------------------------------------
> > a.out/3062 is trying to acquire lock:
> > (&grp->list_mutex){++++}, at: [<ffffffff83d28879>]
> > deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
> > (&grp->list_mutex){++++}, at: [<ffffffff83d28879>]
> > snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
> >
> > but task is already holding lock:
> > (&grp->list_mutex){++++}, at: [<ffffffff83d28879>]
> > deliver_to_subscribers sound/core/seq/seq_clientmgr.c:666 [inline]
> > (&grp->list_mutex){++++}, at: [<ffffffff83d28879>]
> > snd_seq_deliver_event+0x279/0x790 sound/core/seq/seq_clientmgr.c:807
> >
> > other info that might help us debug this:
> > Possible unsafe locking scenario:
> >
> > CPU0
> > ----
> > lock(&grp->list_mutex);
> > lock(&grp->list_mutex);
> >
> > *** DEADLOCK ***
> >
> > May be due to missing lock nesting notation
>
> Indeed, this looks more like simply a missing nesting annotation.
> A totally untested patch is below.

FWIW, the official patch with a proper description is below.
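
To illustrate what lockdep trips over here: every client's grp->list_mutex
belongs to the same lock class, so when snd_seq_deliver_event() re-enters
itself through the dummy client (as in the backtrace above), lockdep sees
the "same" rwsem taken twice even though two distinct instances are held.
The sketch below is abbreviated pseudocode of the delivery path, not the
actual kernel source; only the down_read_nested() line reflects the patch:

```c
/* Abbreviated sketch, not the real deliver_to_subscribers() body. */
static int deliver_to_subscribers(struct snd_seq_client *client,
				  struct snd_seq_event *event,
				  int atomic, int hop)
{
	struct snd_seq_port_subs_info *grp = /* source port's group */;

	/*
	 * All list_mutex rwsems share one lockdep class.  Passing "hop"
	 * as the subclass tells lockdep that each recursion level takes
	 * a *different* instance, silencing the false positive.
	 */
	down_read_nested(&grp->list_mutex, hop);	/* was: down_read() */

	/* ... deliver to each subscriber; this may call back into
	 * snd_seq_deliver_event() for another client with hop + 1 ... */

	up_read(&grp->list_mutex);
	return 0;
}
```

Since hop is already bounded by SNDRV_SEQ_MAX_HOPS, it stays within the
range of subclasses lockdep can track.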


thanks,

Takashi

-- 8< --
From: Takashi Iwai <tiwai@xxxxxxx>
Subject: [PATCH] ALSA: seq: Fix nested rwsem annotation for lockdep splat

syzkaller reported a lockdep splat about a possible deadlock on the
grp->list_mutex of each sequencer client object. This is actually a
false positive caused by missing nested-lock annotations: a sequencer
client may deliver an event directly to another client, which then
takes its own list_mutex.

To address this, the patch replaces the plain down_read() with
down_read_nested(). As the lock subclass, the already existing "hop"
variable can be reused, since it indicates the depth of the delivery
call chain.

Reference: http://lkml.kernel.org/r/089e082686ac9b482e055c832617@xxxxxxxxxx
Reported-by: syzbot <bot+7feb8de6b4d6bf810cf098bef942cc387e79d0ad@xxxxxxxxxxxxxxxxxxxxxxxxx>
Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Takashi Iwai <tiwai@xxxxxxx>
---
sound/core/seq/seq_clientmgr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/core/seq/seq_clientmgr.c b/sound/core/seq/seq_clientmgr.c
index 6c9cba2166d9..d10c780dfd54 100644
--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -663,7 +663,7 @@ static int deliver_to_subscribers(struct snd_seq_client *client,
if (atomic)
read_lock(&grp->list_lock);
else
- down_read(&grp->list_mutex);
+ down_read_nested(&grp->list_mutex, hop);
list_for_each_entry(subs, &grp->list_head, src_list) {
/* both ports ready? */
if (atomic_read(&subs->ref_count) != 2)
--
2.14.2