Re: general protection fault in relay_open_buf

From: Greg KH
Date: Thu Jan 31 2019 - 05:44:48 EST


On Thu, Jan 31, 2019 at 10:54:18PM +1300, Kees Cook wrote:
> On Thu, Jan 31, 2019 at 7:53 AM syzbot
> <syzbot+16c3a70e1e9b29346c43@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: 02495e76ded5 Add linux-next specific files for 20190130
> > git tree: linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=12cf10df400000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=a2b2e9c0bc43c14d
> > dashboard link: https://syzkaller.appspot.com/bug?extid=16c3a70e1e9b29346c43
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13266698c00000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1715bb64c00000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+16c3a70e1e9b29346c43@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > kasan: CONFIG_KASAN_INLINE enabled
> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > general protection fault: 0000 [#1] PREEMPT SMP KASAN
> > CPU: 0 PID: 8092 Comm: syz-executor405 Not tainted 5.0.0-rc4-next-20190130
> > #22
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > RIP: 0010:relay_set_buf_dentry kernel/relay.c:412 [inline]
>
> static inline void relay_set_buf_dentry(struct rchan_buf *buf,
> struct dentry *dentry)
> {
> buf->dentry = dentry;
> d_inode(buf->dentry)->i_size = buf->early_bytes; <--
> }
>
> Doing a bisect landed on this:
>
> ff9fb72bc07705c00795ca48631f7fffe24d2c6b ("debugfs: return error
> values, not NULL")
>
> If I revert this patch, I can't reproduce any more. I don't see a
> relationship, though...
>
> My crash appears as:
> [ 121.934378] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000047
> [ 121.937187] #PF error: [normal kernel read fault]
> [ 121.938824] PGD 800000041f699067 P4D 800000041f699067 PUD 42d08f067 PMD 0
> [ 121.941166] Oops: 0000 [#1] SMP PTI
> [ 121.942381] CPU: 2 PID: 3134 Comm: relay Not tainted
> 5.0.0-rc4-next-20190130 #1020
> [ 121.943873] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.10.2-1ubuntu1 04/01/2014
> [ 121.945395] RIP: 0010:relay_open_buf.part.10+0x2b8/0x330
> ...
> [ 121.960021] Call Trace:
> [ 121.960453] relay_open+0x18e/0x2c0
> [ 121.961070] __blk_trace_setup+0x1af/0x350
> [ 121.961777] blk_trace_ioctl+0x93/0x100
>
>
> $ ./scripts/faddr2line vmlinux relay_open_buf.part.10+0x2b8/0x330
> relay_open_buf.part.10+0x2b8/0x330:
> relay_set_buf_dentry at kernel/relay.c:412
> (inlined by) relay_open_buf at kernel/relay.c:458
>
> So it's the same location, but not sure about 0x47 offset. d_inode is
> 0x58 from dentry. And i_size is 0x50 from inode. If this isn't NULL,
> but rather an ERR_PTR, the errno is either:
>
> EBADF 9 Bad file descriptor
> EEXIST 17 File exists
>
> Neither are used in the debugfs patch, but debugfs is clearly used in
> do_blk_trace_setup():
>
> if (!blk_debugfs_root)
> return -ENOENT;
> ...
> dir = debugfs_lookup(buts->name, blk_debugfs_root);
> if (!dir)
> bt->dir = dir = debugfs_create_dir(buts->name,
> blk_debugfs_root);
> if (!dir)
> goto err;
> ...
> bt->rchan = relay_open("trace", dir, buts->buf_size,
> buts->buf_nr, &blk_relay_callbacks, bt);
>
> Which is confirmed by the next line in my traceback:
>
> $ ./scripts/faddr2line vmlinux __blk_trace_setup+0x1af/0x350
> __blk_trace_setup+0x1af/0x350:
> do_blk_trace_setup at kernel/trace/blktrace.c:534
> (inlined by) __blk_trace_setup at kernel/trace/blktrace.c:577


Ugh. Yes, this makes sense. Someone is setting up the relay code to
create a debugfs file and only checked for NULL on the return value
(which would have blown up without this patch if debugfs was not
enabled...)

Let me go fix this...

thanks,

greg k-h