Re: WARNING in af_alg_make_sg

From: Peter Xu
Date: Wed Apr 08 2020 - 11:12:42 EST


On Wed, Apr 08, 2020 at 05:58:48PM +0800, Hillf Danton wrote:
>
> On Wed, 08 Apr 2020 00:48:13 -0700
> > syzbot found the following crash on:
> >
> > HEAD commit: 763dede1 Merge tag 'for-linus-5.7-rc1' of git://git.kernel..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=12b919c7e00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=12205d036cec317f
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3be1a33f04dc782e9fd5
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=142f3b8fe00000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=159bd23fe00000
> >
> > The bug was bisected to:
> >
> > commit 4426e945df588f2878affddf88a51259200f7e29
> > Author: Peter Xu <peterx@xxxxxxxxxx>
> > Date: Thu Apr 2 04:08:49 2020 +0000
> >
> > mm/gup: allow VM_FAULT_RETRY for multiple times
> >
> > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1408ea9fe00000
> > final crash: https://syzkaller.appspot.com/x/report.txt?x=1608ea9fe00000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1208ea9fe00000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+3be1a33f04dc782e9fd5@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 1 PID: 7094 at crypto/af_alg.c:404 af_alg_make_sg+0x399/0x400 crypto/af_alg.c:404
> > Kernel panic - not syncing: panic_on_warn set ...
> > CPU: 1 PID: 7094 Comm: syz-executor037 Not tainted 5.6.0-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Call Trace:
> > __dump_stack lib/dump_stack.c:77 [inline]
> > dump_stack+0x188/0x20d lib/dump_stack.c:118
> > panic+0x2e3/0x75c kernel/panic.c:221
> > __warn.cold+0x2f/0x35 kernel/panic.c:582
> > report_bug+0x27b/0x2f0 lib/bug.c:195
> > fixup_bug arch/x86/kernel/traps.c:175 [inline]
> > fixup_bug arch/x86/kernel/traps.c:170 [inline]
> > do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
> > do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
> > invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
> > RIP: 0010:af_alg_make_sg+0x399/0x400 crypto/af_alg.c:404
> > Code: 5c 24 2b 31 ff 89 de e8 c5 b9 f8 fd 84 db 74 0e e8 8c b8 f8 fd 48 8b 04 24 48 89 44 24 70 e8 7e b8 f8 fd 0f 0b e8 77 b8 f8 fd <0f> 0b c7 44 24 4c ea ff ff ff e9 4b ff ff ff 48 89 df e8 40 6e 36
> > RSP: 0018:ffffc900018779a0 EFLAGS: 00010293
> > RAX: ffff8880a16b65c0 RBX: ffff8880a4141220 RCX: ffffffff837a763d
> > RDX: 0000000000000000 RSI: ffffffff837a78f9 RDI: 0000000000000005
> > RBP: 000000001fef2254 R08: ffff8880a16b65c0 R09: ffffed10142d6cb9
> > R10: ffff8880a16b65c7 R11: ffffed10142d6cb8 R12: 0000000000000000
> > R13: dffffc0000000000 R14: 0000000000000000 R15: dffffc0000000000
> > hash_sendmsg+0x45c/0xad0 crypto/algif_hash.c:94
> > sock_sendmsg_nosec net/socket.c:652 [inline]
> > sock_sendmsg+0xcf/0x120 net/socket.c:672
> > ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362
> > ___sys_sendmsg+0x100/0x170 net/socket.c:2416
> > __sys_sendmsg+0xec/0x1b0 net/socket.c:2449
> > do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
> > entry_SYSCALL_64_after_hwframe+0x49/0xb3
>
> Make gup feed back correct error code in case of bailout.
>
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1325,8 +1325,11 @@ retry:
> * start trying again otherwise it can loop forever.
> */
>
> - if (fatal_signal_pending(current))
> + if (fatal_signal_pending(current)) {
> + if (!pages_done)
> + pages_done = -EINTR;
> break;
> + }
>
> *locked = 1;
> down_read(&mm->mmap_sem);
>

CC Thomas too.

Sorry for all these mess...

Frankly speaking I didn't notice get_user_pages_fast() forbids
returning zero while __get_user_pages() allowed it... Ideally I think
the gup callers should check against ret>0 to know exactly how many
valid pages we've got, but it's not an excuse good enough...

Hillf, would you mind kick the syzbot directly next time when post the
fix? I'll make bold to do that for you this time, Thanks!

#syz test: #https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

8<--------------------------------------------------------------------