Re: WARNING in percpu_ref_exit (2)

From: Jens Axboe
Date: Sat Dec 21 2019 - 09:02:30 EST


On 12/21/19 6:43 AM, Hillf Danton wrote:
>
> On Sat, 21 Dec 2019 00:05:07 -0800
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit: 7ddd09fc Add linux-next specific files for 20191220
>> git tree: linux-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=12a18cc6e00000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=f183b01c3088afc6
>> dashboard link: https://syzkaller.appspot.com/bug?extid=8c4a14856e657b43487c
>> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14b8f351e00000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14b51925e00000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+8c4a14856e657b43487c@xxxxxxxxxxxxxxxxxxxxxxxxx
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 1 PID: 11482 at lib/percpu-refcount.c:111
>> percpu_ref_exit+0xab/0xd0 lib/percpu-refcount.c:111
>> Kernel panic - not syncing: panic_on_warn set ...
>> CPU: 1 PID: 11482 Comm: syz-executor051 Not tainted
>> 5.5.0-rc2-next-20191220-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>> __dump_stack lib/dump_stack.c:77 [inline]
>> dump_stack+0x197/0x210 lib/dump_stack.c:118
>> panic+0x2e3/0x75c kernel/panic.c:221
>> __warn.cold+0x2f/0x3e kernel/panic.c:582
>> report_bug+0x289/0x300 lib/bug.c:195
>> fixup_bug arch/x86/kernel/traps.c:174 [inline]
>> fixup_bug arch/x86/kernel/traps.c:169 [inline]
>> do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
>> do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
>> invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
>> RIP: 0010:percpu_ref_exit+0xab/0xd0 lib/percpu-refcount.c:111
>> Code: 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 75 1d 48 c7 43 08 03 00
>> 00 00 e8 01 41 e5 fd 5b 41 5c 41 5d 5d c3 e8 f5 40 e5 fd <0f> 0b eb bf 4c
>> 89 ef e8 29 2c 23 fe eb d9 e8 82 2b 23 fe eb a7 4c
>> RSP: 0018:ffffc9000cb17968 EFLAGS: 00010293
>> RAX: ffff8880a3390640 RBX: ffff8880a83a8010 RCX: ffffffff83901432
>> RDX: 0000000000000000 RSI: ffffffff8390149b RDI: ffff8880a83a8028
>> RBP: ffffc9000cb17980 R08: ffff8880a3390640 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000607f51435750
>> R13: ffff8880a83a8018 R14: ffff888097b95000 R15: ffff888097b95228
>> io_sqe_files_unregister+0x7d/0x2f0 fs/io_uring.c:4623
>> io_ring_ctx_free fs/io_uring.c:5575 [inline]
>> io_ring_ctx_wait_and_kill+0x430/0x9a0 fs/io_uring.c:5644
>> io_uring_release+0x42/0x50 fs/io_uring.c:5652
>> __fput+0x2ff/0x890 fs/file_table.c:280
>> ____fput+0x16/0x20 fs/file_table.c:313
>> task_work_run+0x145/0x1c0 kernel/task_work.c:113
>> exit_task_work include/linux/task_work.h:22 [inline]
>> do_exit+0x909/0x2f20 kernel/exit.c:797
>> do_group_exit+0x135/0x360 kernel/exit.c:895
>> get_signal+0x47c/0x24f0 kernel/signal.c:2734
>> do_signal+0x87/0x1700 arch/x86/kernel/signal.c:815
>> exit_to_usermode_loop+0x286/0x380 arch/x86/entry/common.c:160
>> prepare_exit_to_usermode arch/x86/entry/common.c:195 [inline]
>> syscall_return_slowpath arch/x86/entry/common.c:278 [inline]
>> do_syscall_64+0x676/0x790 arch/x86/entry/common.c:304
>> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Flush data->ref_work before killing data->refs.
>
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -4618,10 +4618,10 @@ static int io_sqe_files_unregister(struc
>  	if (!data)
>  		return -ENXIO;
>
> +	flush_work(&data->ref_work);
>  	percpu_ref_kill_and_confirm(&data->refs, io_file_ref_kill);
>  	wait_for_completion(&data->done);
>  	percpu_ref_exit(&data->refs);
> -	flush_work(&data->ref_work);
>
>  	__io_sqe_files_unregister(ctx);
>  	nr_tables = DIV_ROUND_UP(ctx->nr_user_files, IORING_MAX_FILES_TABLE);

Oh indeed, good catch! Thanks, I'll fold this in.

--
Jens Axboe