Re: fs: WARNING in locks_unlink_lock_ctx (not holding proper lock)

From: Jeff Layton
Date: Fri Oct 07 2016 - 19:26:53 EST


On Fri, 2016-10-07 at 22:03 +0200, Dmitry Vyukov wrote:
> Hello,
>
> I am hitting lots of the following warnings while running syzkaller
> fuzzer. Seems that path does not hold proper lock.
>
> WARNING: CPU: 1 PID: 12090 at fs/locks.c:610 locks_unlink_lock_ctx+0x2c7/0x370
> CPU: 1 PID: 12090 Comm: syz-executor Not tainted 4.8.0+ #28
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> ffff880038ba7728 ffffffff82d2b849 ffffffff00000016 fffffbfff10971e8
> ffffffff86e8c000 ffff880038ba7800 ffffffff86f42400 dffffc0000000000
> 0000000000000009 ffff880038ba77f0 ffffffff816a229a 0000000041b58ab3
> Call Trace:
> [< inline >] __dump_stack lib/dump_stack.c:15
> [<ffffffff82d2b849>] dump_stack+0x12e/0x185 lib/dump_stack.c:51
> [<ffffffff816a229a>] panic+0x1e9/0x3f4 kernel/panic.c:153
> [<ffffffff81354fb9>] __warn+0x1c9/0x1e0 kernel/panic.c:509
> [<ffffffff813551a1>] warn_slowpath_null+0x31/0x40 kernel/panic.c:552
> [< inline >] locks_delete_global_locks fs/locks.c:610
> [<ffffffff8193b247>] locks_unlink_lock_ctx+0x2c7/0x370 fs/locks.c:739
> [<ffffffff8193b30f>] locks_delete_lock_ctx+0x1f/0x80 fs/locks.c:751
> [<ffffffff8193d329>] lease_modify+0x229/0x2e0 fs/locks.c:1370
> [< inline >] locks_remove_lease fs/locks.c:2528
> [<ffffffff81947408>] locks_remove_file+0x2d8/0x380 fs/locks.c:2551
> [<ffffffff8182eea6>] __fput+0x1a6/0x780 fs/file_table.c:200
> [<ffffffff8182f50a>] ____fput+0x1a/0x20 fs/file_table.c:244
> [<ffffffff813bae68>] task_work_run+0xf8/0x170 kernel/task_work.c:116
> [< inline >] exit_task_work include/linux/task_work.h:21
> [<ffffffff81364de4>] do_exit+0x864/0x2ad0 kernel/exit.c:828
> [<ffffffff813671cd>] do_group_exit+0x10d/0x330 kernel/exit.c:931
> [<ffffffff8138a57f>] get_signal+0x62f/0x15e0 kernel/signal.c:2307
> [<ffffffff811cf344>] do_signal+0x84/0x18f0 arch/x86/kernel/signal.c:807
> [<ffffffff8100629b>] exit_to_usermode_loop+0x13b/0x200 arch/x86/entry/common.c:156
> [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
> [< inline >] syscall_return_slowpath arch/x86/entry/common.c:259
> [<ffffffff81008a4f>] do_syscall_64+0x49f/0x620 arch/x86/entry/common.c:285
>
> On commit a6930aaee06755d1bdcfd943fbf614e4d92bb0c7 (Oct 5).

(cc'ing Peter...)

Well spotted. Yeah, I think you're right. The assertion is this:

  percpu_rwsem_assert_held(&file_rwsem);

I'm guessing this is probably fallout from the lglock to rwsem
conversion (commit aba376607383).

From a quick glance, I think we probably just need to down_read the
file_rwsem in locks_remove_lease, prior to taking the flc_lock, and
release it just afterward. I do want to go over the code a little more
closely, though, to make sure other codepaths aren't missing that lock.
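
To make the intended ordering concrete, here is a rough userspace model
of it. This is not kernel code and not a patch: every model_* name and
the two *_model variables are hypothetical stand-ins (plain counters,
single-threaded), and the "warnings" counter only mimics what
percpu_rwsem_assert_held(&file_rwsem) would complain about. It just
shows the read side of the rwsem being taken outside the flc_lock and
dropped after it.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins only; these are NOT the kernel's lock types. */
struct model_rwsem    { int readers; };
struct model_spinlock { bool locked; };

static struct model_rwsem    file_rwsem_model; /* models file_rwsem */
static struct model_spinlock flc_lock_model;   /* models ctx->flc_lock */
static int warnings;  /* how many times the modelled assertion fired */

static void model_down_read(struct model_rwsem *s) { s->readers++; }

static void model_up_read(struct model_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

static void model_spin_lock(struct model_spinlock *l)
{
	assert(!l->locked);
	l->locked = true;
}

static void model_spin_unlock(struct model_spinlock *l)
{
	assert(l->locked);
	l->locked = false;
}

/* Models the check in locks_delete_global_locks(): warn if the
 * rwsem is not held by anyone when the lock is unlinked. */
static void model_delete_global_locks(void)
{
	if (file_rwsem_model.readers == 0)
		warnings++;
}

/* Path as in the report: only the flc_lock is held. */
static void model_remove_lease_unfixed(void)
{
	model_spin_lock(&flc_lock_model);
	model_delete_global_locks();
	model_spin_unlock(&flc_lock_model);
}

/* Proposed ordering: rwsem read side outside, flc_lock inside. */
static void model_remove_lease_fixed(void)
{
	model_down_read(&file_rwsem_model);
	model_spin_lock(&flc_lock_model);
	model_delete_global_locks();
	model_spin_unlock(&flc_lock_model);
	model_up_read(&file_rwsem_model);
}
```

Calling model_remove_lease_unfixed() bumps the warnings counter, while
model_remove_lease_fixed() leaves it alone and returns with both
modelled locks released.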

Thanks,
--
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>