Re: BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:LINE
From: Dmitry Vyukov
Date: Fri Dec 01 2017 - 03:20:08 EST

On Thu, Nov 30, 2017 at 9:41 PM, Eric Biggers <ebiggers3@xxxxxxxxx> wrote:
> On Thu, Nov 30, 2017 at 11:55:00AM -0800, syzbot wrote:
>> Call Trace:
>> __dump_stack lib/dump_stack.c:17 [inline]
>> dump_stack+0x194/0x257 lib/dump_stack.c:53
>> ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
>> __might_sleep+0x95/0x190 kernel/sched/core.c:6013
>> __do_page_fault+0x350/0xc90 arch/x86/mm/fault.c:1372
>> do_page_fault+0xee/0x720 arch/x86/mm/fault.c:1504
>> page_fault+0x22/0x30 arch/x86/entry/entry_64.S:1094
>> RIP: 0010:virt_to_cache mm/slab.c:400 [inline]
>> RIP: 0010:kfree+0xb2/0x250 mm/slab.c:3802
>> RSP: 0018:ffff8801cc82f780 EFLAGS: 00010046
>> RAX: 0000000000000000 RBX: ffff8801cc82f948 RCX: ffffffffffffffff
>> RDX: ffffea0007320bc0 RSI: 0000000000000000 RDI: ffff8801cc82f948
>> RBP: ffff8801cc82f7a0 R08: ffffed003a54e4dc R09: 0000000000000000
>> R10: 0000000000000001 R11: ffffed003a54e4db R12: 0000000000000286
>> R13: 0000000000000000 R14: ffff8801cc82f948 R15: ffff8801cc82f8b0
>> blkcipher_walk_done+0x72b/0xde0 crypto/blkcipher.c:139
>> encrypt+0x50a/0xaf0 crypto/salsa20_generic.c:208
>> skcipher_crypt_blkcipher crypto/skcipher.c:622 [inline]
>> skcipher_decrypt_blkcipher+0x213/0x310 crypto/skcipher.c:640
>> crypto_skcipher_decrypt include/crypto/skcipher.h:463 [inline]
>> _skcipher_recvmsg crypto/algif_skcipher.c:144 [inline]
>> skcipher_recvmsg+0xa54/0xf20 crypto/algif_skcipher.c:165
>> sock_recvmsg_nosec net/socket.c:805 [inline]
>> sock_recvmsg+0xc9/0x110 net/socket.c:812
>> ___sys_recvmsg+0x29b/0x630 net/socket.c:2207
>> __sys_recvmsg+0xe2/0x210 net/socket.c:2252
>> SYSC_recvmsg net/socket.c:2264 [inline]
>> SyS_recvmsg+0x2d/0x50 net/socket.c:2259
>> entry_SYSCALL_64_fastpath+0x1f/0x96
>
> Yet another duplicate of the Salsa20 bug:
>
> #syz dup: WARNING: suspicious RCU usage (3)
>
> Looks like this one was incorrectly attributed to x86 rather than crypto?

+Andrey, please check why syzbot has attributed this to x86.

> kfree() is being called with preempt_count corrupted *and* with an uninitialized
> pointer, so it can cause quite a few different problems...

Yeah, it's a bad one. Hopefully fixes will start to propagate soon.
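
For readers unfamiliar with this bug class, here is a rough user-space sketch of the uninitialized-pointer half of the problem. Everything in it is hypothetical: struct walk_state, walk_init(), walk_start(), walk_done() and process() are illustrative names, not the kernel's blkcipher API, and the zero-length request is only an assumed trigger (the thread does not spell out the exact one). It shows why a cleanup path that frees a pointer is only safe if the state was zeroed first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cipher walk state (not the kernel's struct). */
struct walk_state {
    void *buffer;    /* scratch buffer, only allocated when there is data */
    size_t nbytes;   /* bytes left to process */
};

/* Zero every field so the cleanup path never sees stack garbage. */
static void walk_init(struct walk_state *w, size_t nbytes)
{
    assert(w != NULL);
    memset(w, 0, sizeof(*w));
    w->nbytes = nbytes;
}

/* Allocate the scratch buffer; a zero-length request allocates nothing. */
static int walk_start(struct walk_state *w)
{
    if (w->nbytes == 0)
        return 0;
    w->buffer = malloc(w->nbytes);
    return w->buffer ? 0 : -1;
}

/* Cleanup: safe only because walk_init() left buffer == NULL. */
static int walk_done(struct walk_state *w)
{
    int had_buffer = (w->buffer != NULL);

    free(w->buffer);   /* free(NULL) is a no-op */
    w->buffer = NULL;
    w->nbytes = 0;
    return had_buffer;
}

/* Returns 1 if a buffer was released, 0 if there was none, -1 on allocation failure. */
int process(size_t nbytes)
{
    struct walk_state w;   /* lives on the stack: garbage until initialized */

    walk_init(&w, nbytes); /* dropping this line reproduces the bug class */
    if (walk_start(&w) != 0)
        return -1;
    return walk_done(&w);
}
```

With the walk_init() call in place, process(0) reaches the cleanup path with a NULL buffer and nothing is freed. Without it, walk_done() would hand whatever happened to be on the stack to free(), which is the user-space analogue of kfree() faulting on a garbage pointer. kfree() likewise ignores NULL, so zero-initializing the state is enough to make such a cleanup path harmless.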