Re: BUG: unable to handle kernel paging request in do_futex
From: Andrey Ryabinin
Date: Thu Dec 14 2017 - 11:36:07 EST
On 12/14/2017 06:31 PM, Thomas Gleixner wrote:
> On Thu, 30 Nov 2017, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on 11fed7829beff10184503fd65e5919926464601a
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>>
>> Unfortunately, I don't have any reproducer for this bug yet.
>>
>>
>> BUG: unable to handle kernel paging request at 00000000c314149f
>
> That's a user space address which is nowhere in the registers. Is that
> perhaps pre commit: 328b4ed93b69a ?
Seems so. The kernel version is 4.15.0-rc1-next-20171130+, so it shouldn't have that commit.
>> IP: arch_futex_atomic_op_inuser arch/x86/include/asm/futex.h:67 [inline]
>> IP: futex_atomic_op_inuser kernel/futex.c:1588 [inline]
>> IP: futex_wake_op kernel/futex.c:1637 [inline]
>> IP: do_futex+0x14c8/0x2280 kernel/futex.c:3483
>> PGD 5e28067 P4D 5e28067 PUD 5e2a067 PMD 0
>> Oops: 0002 [#1] SMP KASAN
>
> ^^^^ X86_PF_WRITE
>
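For reference, a minimal sketch of how the "Oops: 0002" error code breaks down. The bit names are the X86_PF_* flags as I remember them from arch/x86/include/asm/traps.h; the helper name decode_pf_error is made up for illustration.

```python
# Page-fault error code bits (names assumed to match the X86_PF_* enum
# in arch/x86/include/asm/traps.h).
X86_PF_BITS = [
    (1 << 0, "X86_PF_PROT"),   # 0: page not present, 1: protection violation
    (1 << 1, "X86_PF_WRITE"),  # 0: read access, 1: write access
    (1 << 2, "X86_PF_USER"),   # 0: kernel-mode access, 1: user-mode access
    (1 << 3, "X86_PF_RSVD"),   # reserved bit set in a paging entry
    (1 << 4, "X86_PF_INSTR"),  # fault was an instruction fetch
    (1 << 5, "X86_PF_PK"),     # protection-key violation
]


def decode_pf_error(code):
    """Return the names of the flags set in a page-fault error code."""
    return [name for bit, name in X86_PF_BITS if code & bit]


# "Oops: 0002" from the report.
print(decode_pf_error(0x0002))  # ['X86_PF_WRITE']
```

With only the WRITE bit set, this reads as a write from kernel mode to a page that is not present, which matches the "PMD 0" in the page table dump.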
>> Dumping ftrace buffer:
>> (ftrace buffer empty)
>> Modules linked in:
>> CPU: 0 PID: 14626 Comm: syz-executor6 Not tainted 4.15.0-rc1-next-20171130+
>> #56
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
>> 01/01/2011
>> task: 000000005f17dad6 task.stack: 000000005af7607c
>> RIP: 0010:arch_futex_atomic_op_inuser arch/x86/include/asm/futex.h:67 [inline]
>> RIP: 0010:futex_atomic_op_inuser kernel/futex.c:1588 [inline]
>> RIP: 0010:futex_wake_op kernel/futex.c:1637 [inline]
>> RIP: 0010:do_futex+0x14c8/0x2280 kernel/futex.c:3483
>> RSP: 0018:ffff8801cffafa18 EFLAGS: 00010246
>> RAX: 000000007fffffff RBX: 0000000040000002 RCX: ffffffff8164e3d9
>> RDX: 0000000000000000 RSI: ffffc900034e8000 RDI: 0000000000000000
>> RBP: ffff8801cffafe38 R08: 1ffffffff0d31367 R09: 0000000000000004
>> R10: 0000000000000000 R11: ffffffff8748cd60 R12: ffff8801d0f30180
>> R13: 0000000020000000 R14: dffffc0000000000 R15: ffff8801cffafe10
>> FS: 00007f66305e0700(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: fffffffffffffff8 CR3: 00000001ccc2e000 CR4: 00000000001426f0
>
> ^^^^^^^^^^^^^^^^ is a totally different address so it's either
> completely bogus or the above address is a hashed pointer because
> that printk used to be %p and was changed to %px in 328b4ed93b69a
>
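A rough way to check the hashed-pointer theory against the report. This is only a heuristic sketch: it assumes that hashed %p output on a 64-bit kernel keeps just the low 32 bits of the hash, which is how I believe the hashing worked at the time. The helper name looks_hashed is hypothetical.

```python
def looks_hashed(hex_str):
    """Heuristic: a 16-digit hex value whose upper 32 bits are all zero.

    Assumption: hashed %p output on 64-bit is masked to 32 bits.
    A genuine low user address passes this test too, so it is only a hint.
    """
    value = int(hex_str, 16)
    return len(hex_str) == 16 and (value >> 32) == 0


# Values copied from the oops above.
values = {
    "fault address": "00000000c314149f",
    "task": "000000005f17dad6",
    "task.stack": "000000005af7607c",
    "RSP": "ffff8801cffafa18",
    "CR2": "fffffffffffffff8",
}

for name, text in values.items():
    print(f"{name:14} {text} looks_hashed={looks_hashed(text)}")
```

The task and task.stack values must be kernel pointers, yet both have a zero upper half, just like the reported fault address. The register values are printed as plain hex and keep their real upper bits.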
>> DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff3 DR7: 0000000000bb060a
>> Call Trace:
>> SYSC_futex kernel/futex.c:3533 [inline]
>> SyS_futex+0x260/0x390 kernel/futex.c:3501
>> entry_SYSCALL_64_fastpath+0x1f/0x96
>> RIP: 0033:0x4529d9
>> RSP: 002b:00007f66305dfc58 EFLAGS: 00000212 ORIG_RAX: 00000000000000ca
>> RAX: ffffffffffffffda RBX: 00007f66305e0700 RCX: 00000000004529d9
>> RDX: 0000000000000007 RSI: 0000000000000085 RDI: 0000000020062000
>> RBP: 0000000000000000 R08: 0000000020000000 R09: 0000000040000002
>> R10: 000000002085fff0 R11: 0000000000000212 R12: 0000000000000000
>> R13: 0000000000a6f7ff R14: 00007f66305e09c0 R15: 0000000000000000
>
> The arguments are:
>
> RDI uaddr 0000000020062000
> RSI op 0000000000000085
> RDX val 0000000000000007
> RCX utime 00000000004529d9
> R8 uaddr2 0000000020000000
> R9 val2 0000000040000002
>
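A small sketch for decoding the futex arguments from the user-mode register dump. Two assumptions on my side: the x86-64 syscall convention passes arguments in RDI, RSI, RDX, R10, R8, R9 (RCX is overwritten with the return RIP by the syscall instruction, and it does equal the user RIP 0x4529d9 here), and for FUTEX_WAKE_OP the fourth argument is used as val2 while the sixth one (val3) carries the encoded operation. The constants are the ones I know from include/uapi/linux/futex.h; the function names are made up.

```python
FUTEX_PRIVATE_FLAG = 128
FUTEX_CLOCK_REALTIME = 256
FUTEX_CMD_MASK = ~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)

# Partial tables, only what is needed for this report.
FUTEX_CMDS = {0: "FUTEX_WAIT", 1: "FUTEX_WAKE", 5: "FUTEX_WAKE_OP"}
FUTEX_OPS = {0: "FUTEX_OP_SET", 1: "FUTEX_OP_ADD", 2: "FUTEX_OP_OR",
             3: "FUTEX_OP_ANDN", 4: "FUTEX_OP_XOR"}
FUTEX_CMPS = {0: "EQ", 1: "NE", 2: "LT", 3: "LE", 4: "GT", 5: "GE"}


def sign_extend12(value):
    """Sign-extend a 12-bit field, as the kernel does for oparg/cmparg."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def decode_cmd(op):
    """Split the 'op' syscall argument into (command name, private flag)."""
    cmd = op & FUTEX_CMD_MASK
    return FUTEX_CMDS.get(cmd, str(cmd)), bool(op & FUTEX_PRIVATE_FLAG)


def decode_wake_op(encoded):
    """Decode the FUTEX_WAKE_OP operation word (val3)."""
    return {
        "oparg_shift": bool(encoded & 0x80000000),
        "op": FUTEX_OPS.get((encoded >> 28) & 0x7, "?"),
        "cmp": FUTEX_CMPS.get((encoded >> 24) & 0xF, "?"),
        "oparg": sign_extend12(encoded >> 12),
        "cmparg": sign_extend12(encoded),
    }


print(decode_cmd(0x85))            # RSI from the dump
print(decode_wake_op(0x40000002))  # R9 from the dump
```

Under these assumptions the call is FUTEX_WAKE_OP with the private flag, operating on uaddr2 = 0x20000000 with an XOR of oparg 0, which fits the xor in the disassembly. The fourth argument would then be R10 = 0x2085fff0 rather than RCX.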
>> Code: 31 d2 0f 1f 00 45 87 65 00 0f 1f 00 89 95 30 fc ff ff e9 1d ff ff ff e8
>> 67 56 0b 00 31 d2 8b bd 00 fc ff ff 0f 1f 00 41 8b 45 00 <89> c1 31 f9 f0 41
>> 0f b1 4d 00 75 f0 0f 1f 00 41 89 c4 89 95 30
>
> and the code is:
>
> 27: 41 8b 45 00 mov 0x0(%r13),%eax
> 2b:* 89 c1 mov %eax,%ecx <-- trapping instruction
> 2d: 31 f9 xor %edi,%ecx
> 2f: f0 41 0f b1 4d 00 lock cmpxchg %ecx,0x0(%r13)
> 35: 75 f0 jne 0x27
>
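To double check which instruction the oops points at: the byte in angle brackets on the "Code:" line is the one at the faulting RIP, so its position in the dump gives the offset that the disassembly marks with "*". A minimal sketch (the helper name trapping_offset is made up):

```python
def trapping_offset(code_line):
    """Return the index of the byte wrapped in <..> on an oops Code: line."""
    for offset, token in enumerate(code_line.split()):
        if token.startswith("<") and token.endswith(">"):
            return offset
    raise ValueError("no <..> marker found in Code: line")


# The Code: bytes from the report, joined from the three wrapped lines.
CODE = (
    "31 d2 0f 1f 00 45 87 65 00 0f 1f 00 89 95 30 fc ff ff e9 1d ff ff ff e8 "
    "67 56 0b 00 31 d2 8b bd 00 fc ff ff 0f 1f 00 41 8b 45 00 <89> c1 31 f9 f0 41 "
    "0f b1 4d 00 75 f0 0f 1f 00 41 89 c4 89 95 30"
)

offset = trapping_offset(CODE)
print(hex(offset))                          # 0x2b
print(CODE.split()[offset - 4:offset])      # bytes of the preceding mov
```

The marker lands at 0x2b, the "mov %eax,%ecx", and the four bytes before it (41 8b 45 00) are the load from 0x0(%r13).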
> The trapping instruction cannot trap :). Assumed it's the move before that,
> then the accessed location is R13 + 0 = 0000000020000000, which is uaddr2
> and entirely correct.
>
But the fault address must be 0xfffffffffffffff8 as per CR2, so it can't be 'mov 0x0(%r13),%eax' either. Right?
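Spelled out as a quick check, using only the values from the register dump (the helper name to_signed64 is made up):

```python
MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Interpret a 64-bit value as a signed two's-complement integer."""
    return value - (1 << 64) if value & (1 << 63) else value


CR2 = 0xFFFFFFFFFFFFFFF8  # faulting address reported by the CPU
R13 = 0x0000000020000000  # base register of 'mov 0x0(%r13),%eax'

# Effective address of the load: base register plus displacement 0.
effective = (R13 + 0) & MASK64

print(hex(effective), effective == CR2)
print(to_signed64(CR2))  # -8
```

The load through R13 would touch 0x20000000, which is not what CR2 holds. Read as a signed value, CR2 is -8.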
> And what I completely fail to understand is why this triggers at all. That
> code section is guarded by an extable fixup so this should never come in.
>
> Is this a KASAN artifact?
>
I don't see any evidence for KASAN being involved here.
> Thanks,
>
> tglx
>