Re: BUG: unable to handle kernel NULL pointer dereference in irq_may_run

From: Eric Biggers
Date: Sat Dec 23 2017 - 15:38:34 EST


On Fri, Dec 22, 2017 at 08:13:47PM +0100, Thomas Gleixner wrote:
> On Thu, 21 Dec 2017, syzbot wrote:
>
> > Hello,
> >
> > syzkaller hit the following crash on 6084b576dca2e898f5c101baef151f7bfdbb606d
> > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > compiler: gcc (GCC) 7.1.1 20170620
> > .config is attached
> > Raw console output is attached.
> > C reproducer is attached
> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > for information about syzkaller reproducers
>
> Unfortunately I cannot reproduce that issue.
>
> > BUG: unable to handle kernel NULL pointer dereference at (null)
> > IP: irqd_has_set kernel/irq/internals.h:230 [inline]
> > IP: irq_may_run+0x19/0x70 kernel/irq/chip.c:506
> > PGD 0 P4D 0
> > Oops: 0000 [#1] SMP
> > Dumping ftrace buffer:
> > (ftrace buffer empty)
> > Modules linked in:
> > CPU: 0 PID: 3177 Comm: kworker/u4:2 Not tainted 4.15.0-rc3-next-20171214+ #67
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > RIP: 0010:irqd_has_set kernel/irq/internals.h:230 [inline]
>
> So this dereferences
>
> irq_desc->irq_data->common
>
> which is NULL:
>
> 2b:* f7 00 00 00 0c 00 testl $0xc0000,(%rax) <-- trapping instruction
>
> > RIP: 0010:irq_may_run+0x19/0x70 kernel/irq/chip.c:506
> > RSP: 0018:ffff88021fc03f58 EFLAGS: 00010006
> > RAX: 0000000000000000 RBX: ffff8802151fa400 RCX: ffffffff81243385
>
>        ^^^^^^^^^^^^^^^^
>
> > RDX: 0000000000010000 RSI: 0000000000000000 RDI: ffff8802151fa400
> > RBP: ffff88021fc03f68 R08: 0000000000000001 R09: 000000000000000c
> > R10: ffff88021fc03ee8 R11: 000000000000000c R12: 0000000000000001
> > R13: ffff8802151fa400 R14: 0000000000000027 R15: 0000000000000000
> > FS: 0000000000000000(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000000000 CR3: 000000000301e003 CR4: 00000000001606f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <IRQ>
> > handle_edge_irq+0x33/0x220 kernel/irq/chip.c:755
> > generic_handle_irq_desc include/linux/irqdesc.h:159 [inline]
> > handle_irq+0x15/0x20 arch/x86/kernel/irq_64.c:77
> > do_IRQ+0x53/0x100 arch/x86/kernel/irq.c:229
> > common_interrupt+0xa9/0xa9 arch/x86/entry/entry_64.S:695
>
> Now what confuses me is the fact that
>
> irq_desc->irq_data->common
>
> is initialized in desc_set_defaults() when the irq descriptor is
> allocated. It's not written to after that. Plus it got dereferenced before.
> So this looks like a stray pointer.
>
> I have no clue how that could be related to the reproducer. Is this
> reproducing 100% on your end? If yes I surely can try to add some debug
> which might help to catch this.
>
> Thanks,
>
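
For reference, the trapping instruction in the quoted oops matches the first
check in irq_may_run(): the immediate 0xc0000 is IRQD_IRQ_INPROGRESS |
IRQD_WAKEUP_ARMED, and the load goes through the 'common' pointer with
RAX == 0. Below is a rough userspace sketch of that check. The model_* structs
and functions are simplified stand-ins I made up for illustration, not the
kernel definitions; only the two bit values are taken from include/linux/irq.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit values as in include/linux/irq.h. */
#define IRQD_IRQ_INPROGRESS (1u << 18)
#define IRQD_WAKEUP_ARMED   (1u << 19)
#define IRQ_BUSY_MASK       (IRQD_IRQ_INPROGRESS | IRQD_WAKEUP_ARMED)

/* Same value as the immediate operand of the trapping testl. */
_Static_assert(IRQ_BUSY_MASK == 0xc0000u, "mask should match testl $0xc0000");

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct model_common_data {
	unsigned int state;	/* first member, so it is read at offset 0 */
};

struct model_irq_data {
	struct model_common_data *common;
};

/* Models irqd_has_set(): reads the state word through d->common. */
static bool model_irqd_has_set(const struct model_irq_data *d, unsigned int mask)
{
	return (d->common->state & mask) != 0;
}

/* Models only the first check of irq_may_run(), not the whole function. */
static bool model_fast_path_ok(const struct model_irq_data *d)
{
	/*
	 * The kernel has no such check because 'common' is set once at
	 * allocation time. With common == NULL the read above would fault
	 * at address 0, which is what CR2 shows in the oops.
	 */
	assert(d->common != NULL);
	return !model_irqd_has_set(d, IRQ_BUSY_MASK);
}
```

With a valid 'common' pointer, model_fast_path_ok() returns true while neither
bit is set and false once either is set. A NULL 'common', as in the report,
means the descriptor memory was overwritten by something else.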

This is yet another report where the reproducer uses AF_ALG to bind to an
algorithm that uses 'pcrypt', so it is hitting the pcrypt_free() bug, which
causes slab cache corruption:

https://groups.google.com/forum/#!topic/syzkaller-bugs/NKn_ivoPOpk

https://patchwork.kernel.org/patch/10126761/
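
For anyone unfamiliar with the interface, binding an AF_ALG socket to an
algorithm looks roughly like the sketch below. try_bind_alg() is a hypothetical
helper name, and this is not the syzkaller reproducer; the algorithm type and
name are whatever the caller passes in (something like "aead" and
"pcrypt(gcm(aes))" would be one illustrative choice for a pcrypt-wrapped
algorithm, but that pair is my assumption, not taken from the report).

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if_alg.h>

/*
 * Hypothetical helper: open an AF_ALG socket and bind it to the given
 * algorithm type and name. Returns 0 on success or a negative errno value.
 */
static int try_bind_alg(const char *type, const char *name)
{
	struct sockaddr_alg sa;
	int fd, ret = 0;

	fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	if (fd < 0)
		return -errno;	/* e.g. AF_ALG not available */

	memset(&sa, 0, sizeof(sa));
	sa.salg_family = AF_ALG;
	/* Leave the last byte zero so both strings stay NUL-terminated. */
	strncpy((char *)sa.salg_type, type, sizeof(sa.salg_type) - 1);
	strncpy((char *)sa.salg_name, name, sizeof(sa.salg_name) - 1);

	/* bind() is where the kernel looks up or instantiates the algorithm. */
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

The helper only performs the bind and closes the socket again; it does not try
to reproduce the corruption.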

So let's mark it as a duplicate:

#syz dup: KASAN: use-after-free Read in __list_del_entry_valid (2)

Eric