Re: BUG: unable to handle kernel paging request in do_futex

From: Thomas Gleixner
Date: Thu Dec 14 2017 - 12:03:04 EST


On Thu, 14 Dec 2017, Andrey Ryabinin wrote:
> On 12/14/2017 06:31 PM, Thomas Gleixner wrote:
> > On Thu, 30 Nov 2017, syzbot wrote:
> >> BUG: unable to handle kernel paging request at 00000000c314149f
> >
> > That's a user space address which is nowhere in the registers. Is that
> > perhaps from before commit 328b4ed93b69a?
>
> Seems so. The kernel version is 4.15.0-rc1-next-20171130+, so it shouldn't have that commit.
>
> >> IP: arch_futex_atomic_op_inuser arch/x86/include/asm/futex.h:67 [inline]
> >> IP: futex_atomic_op_inuser kernel/futex.c:1588 [inline]
> >> IP: futex_wake_op kernel/futex.c:1637 [inline]
> >> IP: do_futex+0x14c8/0x2280 kernel/futex.c:3483
> >> PGD 5e28067 P4D 5e28067 PUD 5e2a067 PMD 0
> >> Oops: 0002 [#1] SMP KASAN
> >
> > ^^^^ X86_PF_WRITE
> >
> >> Dumping ftrace buffer:
> >> (ftrace buffer empty)
> >> Modules linked in:
> >> CPU: 0 PID: 14626 Comm: syz-executor6 Not tainted 4.15.0-rc1-next-20171130+ #56
> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >> task: 000000005f17dad6 task.stack: 000000005af7607c
> >> RIP: 0010:arch_futex_atomic_op_inuser arch/x86/include/asm/futex.h:67 [inline]
> >> RIP: 0010:futex_atomic_op_inuser kernel/futex.c:1588 [inline]
> >> RIP: 0010:futex_wake_op kernel/futex.c:1637 [inline]
> >> RIP: 0010:do_futex+0x14c8/0x2280 kernel/futex.c:3483
> >> RSP: 0018:ffff8801cffafa18 EFLAGS: 00010246
> >> RAX: 000000007fffffff RBX: 0000000040000002 RCX: ffffffff8164e3d9
> >> RDX: 0000000000000000 RSI: ffffc900034e8000 RDI: 0000000000000000
> >> RBP: ffff8801cffafe38 R08: 1ffffffff0d31367 R09: 0000000000000004
> >> R10: 0000000000000000 R11: ffffffff8748cd60 R12: ffff8801d0f30180
> >> R13: 0000000020000000 R14: dffffc0000000000 R15: ffff8801cffafe10
> >> FS: 00007f66305e0700(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
> >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: fffffffffffffff8 CR3: 00000001ccc2e000 CR4: 00000000001426f0
> >
> > ^^^^^^^^^^^^^^^^ is a totally different address, so it's either
> > completely bogus or the above address is a hashed pointer, because
> > that printk used to be %p and was changed to %px in 328b4ed93b69a.
> >
> >> DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000ffff0ff3 DR7: 0000000000bb060a
> >> Call Trace:
> >> SYSC_futex kernel/futex.c:3533 [inline]
> >> SyS_futex+0x260/0x390 kernel/futex.c:3501
> >> entry_SYSCALL_64_fastpath+0x1f/0x96
> >> RIP: 0033:0x4529d9
> >> RSP: 002b:00007f66305dfc58 EFLAGS: 00000212 ORIG_RAX: 00000000000000ca
> >> RAX: ffffffffffffffda RBX: 00007f66305e0700 RCX: 00000000004529d9
> >> RDX: 0000000000000007 RSI: 0000000000000085 RDI: 0000000020062000
> >> RBP: 0000000000000000 R08: 0000000020000000 R09: 0000000040000002
> >> R10: 000000002085fff0 R11: 0000000000000212 R12: 0000000000000000
> >> R13: 0000000000a6f7ff R14: 00007f66305e09c0 R15: 0000000000000000
> >
> > The arguments are:
> >
> > RDI uaddr 0000000020062000
> > RSI op 0000000000000085
> > RDX val 0000000000000007
> > R10 utime 000000002085fff0
> > R8 uaddr2 0000000020000000
> > R9 val3 0000000040000002
> >
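
For reference, the op and encoded-op values can be unpacked by hand. Below is
a minimal sketch, assuming the bit layout from include/uapi/linux/futex.h; the
struct and helper names are made up for illustration and are not kernel code.
It takes the RSI value (0x85) and the sixth syscall argument from R9
(0x40000002, called val3 in futex(2)).

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants as defined in include/uapi/linux/futex.h. */
#define FUTEX_WAKE_OP         5
#define FUTEX_PRIVATE_FLAG    128
#define FUTEX_CLOCK_REALTIME  256
#define FUTEX_CMD_MASK        (~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME))
#define FUTEX_OP_XOR          4
#define FUTEX_OP_CMP_EQ       0
#define FUTEX_OP_OPARG_SHIFT  8

/* Hypothetical container for the decoded fields; not a kernel type. */
struct wake_op_fields {
	int op;            /* FUTEX_OP_* operation applied to *uaddr2 */
	int cmp;           /* FUTEX_OP_CMP_* comparison on the old value */
	int oparg;         /* operand for op, 12 bits, sign-extended */
	int cmparg;        /* operand for cmp, 12 bits, sign-extended */
	bool oparg_shift;  /* FUTEX_OP_OPARG_SHIFT: use (1 << oparg) instead */
};

/* Sign-extend the low 12 bits of v. */
static int sext12(uint32_t v)
{
	return (int)((v & 0xfff) ^ 0x800) - 0x800;
}

/* Hypothetical helper: split an encoded FUTEX_WAKE_OP operation word. */
static struct wake_op_fields decode_wake_op(uint32_t encoded)
{
	struct wake_op_fields f;

	f.op = (encoded >> 28) & 7;
	f.cmp = (encoded >> 24) & 15;
	f.oparg = sext12(encoded >> 12);
	f.cmparg = sext12(encoded);
	f.oparg_shift = (encoded & ((uint32_t)FUTEX_OP_OPARG_SHIFT << 28)) != 0;
	return f;
}

/* Strip the PRIVATE and CLOCK_REALTIME flags to get the futex command. */
static int futex_cmd(int op)
{
	return op & FUTEX_CMD_MASK;
}
```

With the values from this report, futex_cmd(0x85) gives 5 (FUTEX_WAKE_OP, with
FUTEX_PRIVATE_FLAG set), and decode_wake_op(0x40000002) gives op 4
(FUTEX_OP_XOR), cmp 0 (FUTEX_OP_CMP_EQ), oparg 0 and cmparg 2. That matches
the xor/cmpxchg loop in the disassembly.
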
> >> Code: 31 d2 0f 1f 00 45 87 65 00 0f 1f 00 89 95 30 fc ff ff e9 1d ff ff ff e8
> >> 67 56 0b 00 31 d2 8b bd 00 fc ff ff 0f 1f 00 41 8b 45 00 <89> c1 31 f9 f0 41
> >> 0f b1 4d 00 75 f0 0f 1f 00 41 89 c4 89 95 30
> >
> > and the code is:
> >
> >   27:   41 8b 45 00             mov    0x0(%r13),%eax
> >   2b:*  89 c1                   mov    %eax,%ecx     <-- trapping instruction
> >   2d:   31 f9                   xor    %edi,%ecx
> >   2f:   f0 41 0f b1 4d 00       lock cmpxchg %ecx,0x0(%r13)
> >   35:   75 f0                   jne    0x27
> >
> > The trapping instruction cannot trap :). Assuming it's the mov before that,
> > the accessed location is R13 + 0 = 0000000020000000, which is uaddr2
> > and entirely correct.
> >
> But the fault address must be 0xfffffffffffffff8 as per CR2, so it can't be
> 'mov 0x0(%r13),%eax' either. Right?

Indeed
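
The mismatch is easy to check by hand. Here is a minimal sketch of that
arithmetic, with hypothetical helper names, using R13 and CR2 from the dump
above:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: effective address of a "disp(%base)" operand. */
static uint64_t effective_address(uint64_t base, int64_t disp)
{
	return base + (uint64_t)disp;
}

/*
 * True if an access of `size` bytes starting at `ea` touches `fault_addr`.
 * The unsigned subtraction wraps, so addresses below `ea` never match.
 */
static bool access_covers(uint64_t ea, unsigned int size, uint64_t fault_addr)
{
	return fault_addr - ea < size;
}

/*
 * CR2 reinterpreted as a signed offset. The conversion relies on two's
 * complement, which holds for every compiler the kernel supports.
 */
static int64_t as_signed(uint64_t addr)
{
	return (int64_t)addr;
}
```

For R13 = 0x20000000 and CR2 = 0xfffffffffffffff8, the 4-byte access at
0x0(%r13) does not cover CR2, and CR2 read as a signed value is -8, which
does not correspond to any of the futex arguments in the register dump.
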

> > And what I completely fail to understand is why this triggers at all. That
> > code section is guarded by an extable fixup, so this should never come in.
> >
> > Is this a KASAN artifact?
> >
> I don't see any evidence for KASAN being involved here.

I was just asking because of:

>> Oops: 0002 [#1] SMP KASAN
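
The KASAN tag in that line only says the kernel was built with CONFIG_KASAN;
the 0002 is the page fault error code. A minimal sketch of decoding it,
assuming the usual X86_PF_* bit assignments (the struct and function names are
made up for illustration):

```c
#include <stdbool.h>

/* x86 page fault error code bits (the X86_PF_* names used in arch/x86). */
enum {
	X86_PF_PROT  = 1 << 0,  /* 0: page not present, 1: protection fault */
	X86_PF_WRITE = 1 << 1,  /* 0: read access,      1: write access */
	X86_PF_USER  = 1 << 2,  /* 0: kernel mode,      1: user mode */
	X86_PF_RSVD  = 1 << 3,  /* reserved bit set in a paging entry */
	X86_PF_INSTR = 1 << 4,  /* fault was an instruction fetch */
};

/* Hypothetical container for the decoded bits; not a kernel type. */
struct pf_fields {
	bool protection;
	bool write;
	bool user;
	bool rsvd;
	bool instr;
};

/* Hypothetical helper: split a page fault error code into its flags. */
static struct pf_fields decode_pf_error(unsigned long error_code)
{
	struct pf_fields f = {
		.protection = (error_code & X86_PF_PROT) != 0,
		.write      = (error_code & X86_PF_WRITE) != 0,
		.user       = (error_code & X86_PF_USER) != 0,
		.rsvd       = (error_code & X86_PF_RSVD) != 0,
		.instr      = (error_code & X86_PF_INSTR) != 0,
	};

	return f;
}
```

decode_pf_error(0x0002) has only the write flag set: a write from kernel mode
to a page that is not present.
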

Thanks,

tglx