Re: general protection fault in n_tty_receive_buf_common

From: Jiri Slaby
Date: Fri Oct 27 2017 - 10:08:10 EST


On 10/27/2017, 11:24 AM, Dmitry Vyukov wrote:
> On Fri, Oct 27, 2017 at 11:22 AM, syzbot
> <bot+7fde9fa6e982d17b9acf978961e059b0a5344719@xxxxxxxxxxxxxxxxxxxxxxxxx>
> wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> 623ce3456671ea842c0ebda79c38655c8c04af74
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>
> A more recent report is on upstream 0787643a5f6aad1f0cdeb305f7fe492b71943ea4
>
> kasan: GPF could be caused by NULL-ptr deref or user memory access
> general protection fault: 0000 [#1] SMP KASAN
> Dumping ftrace buffer:
> (ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 59 Comm: kworker/u4:2 Not tainted 4.14.0-rc5+ #142
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: events_unbound flush_to_ldisc
> task: ffff8801d9aa6700 task.stack: ffff8801d9ab0000
> RIP: 0010:__read_once_size include/linux/compiler.h:276 [inline]
> RIP: 0010:n_tty_receive_buf_common+0x154/0x2520 drivers/tty/n_tty.c:1690
> RSP: 0018:ffff8801d9ab7130 EFLAGS: 00010202
> RAX: 000000000000044c RBX: ffff8801d4cb0400 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff8801d4cb07a8 RDI: 0000000000002260
> RBP: ffff8801d9ab72d8 R08: ffffffff82721563 R09: 0000000000000005
> R10: ffff8801d9ab6f68 R11: 0000000000000002 R12: ffff8801d28ed27f
> R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8801d28ed27f
> FS: 0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f97bb6e6330 CR3: 00000001c5959000 CR4: 00000000001426e0
> Call Trace:
> n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1746
> tty_ldisc_receive_buf+0xa7/0x180 drivers/tty/tty_buffer.c:455
> tty_port_default_receive_buf+0x106/0x160 drivers/tty/tty_port.c:37
> receive_buf drivers/tty/tty_buffer.c:474 [inline]
> flush_to_ldisc+0x3c4/0x590 drivers/tty/tty_buffer.c:523
> process_one_work+0xbf0/0x1bc0 kernel/workqueue.c:2119
> worker_thread+0x223/0x1860 kernel/workqueue.c:2253
> kthread+0x35e/0x430 kernel/kthread.c:231
> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
> Code: 89 85 18 ff ff ff 48 8d 45 98 48 89 85 e0 fe ff ff e8 61 a8 fb
> fe 48 8b 85 18 ff ff ff c6 00 00 48 8b 85 a0 fe ff ff 48 c1 e8 03 <42>
> 80 3c 30 00 0f 85 6d 1e 00 00 48 8b 85 28 ff ff ff 4c 8b a8

This decodes as:
0: 89 85 18 ff ff ff mov %eax,-0xe8(%rbp)
6: 48 8d 45 98 lea -0x68(%rbp),%rax
a: 48 89 85 e0 fe ff ff mov %rax,-0x120(%rbp)
11: e8 61 a8 fb fe callq 0xfffffffffefba877
16: 48 8b 85 18 ff ff ff mov -0xe8(%rbp),%rax
1d: c6 00 00 movb $0x0,(%rax)
20: 48 8b 85 a0 fe ff ff mov -0x160(%rbp),%rax
27: 48 c1 e8 03 shr $0x3,%rax
2b:* 42 80 3c 30 00 cmpb $0x0,(%rax,%r14,1) <-- trapping instruction
30: 0f 85 6d 1e 00 00 jne 0x1ea3
36: 48 8b 85 28 ff ff ff mov -0xd8(%rbp),%rax

So KASAN is checking offset 0x44c (RAX) in the shadow. This corresponds
to 0x2260 (0x44c << 3, see also RDI) in normal memory. 0x2260 is the
offset of read_tail in struct n_tty_data, aka ldata in this function.
So ldata (i.e. tty->ldisc_data) is NULL.
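
To spell out the arithmetic, here is a minimal user-space sketch (not
kernel code). It assumes the usual x86-64 KASAN parameters: a scale
shift of 3 and a shadow offset of 0xdffffc0000000000, which is the value
sitting in R14 in the oops. The helper names mem_to_shadow(),
shadow_to_mem() and check_report_values() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 KASAN parameters; the offset matches R14 in the oops. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Hypothetical helper: shadow byte address checked for a memory address. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Hypothetical helper: start of the 8-byte granule a shadow byte covers. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/* Re-derive the numbers from the report: RAX = 0x44c, R14 = shadow offset. */
static void check_report_values(void)
{
	uint64_t rax = 0x44c;
	uint64_t r14 = KASAN_SHADOW_OFFSET;
	uint64_t shadow = rax + r14;	/* operand of the trapping cmpb */

	assert(shadow == 0xdffffc000000044cULL);
	assert(shadow_to_mem(shadow) == 0x2260);	/* matches RDI */
	assert(mem_to_shadow(0x2260) == shadow);
}
```

The resulting shadow address 0xdffffc000000044c is not a canonical
x86-64 address, which would explain why this shows up as a general
protection fault rather than a page fault.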

This means ->receive_buf was called before n_tty_open ran (or
finished).

Coincidentally, Yuhong Tao is seeing a similar issue and tried to fix it
in his "tty: fix flush_to_ldisc() oops before tty_open is done" the day
before yesterday. The patch is not correct though.

At this point I am curious: what is the driver behind the failing tty?
And who queued flush_to_ldisc at this point, and why? May I ask you to
apply the patch from the attachment? It is only for debugging, but I
have been running my debug kernels with it for quite some time already
(almost 7 years, apparently 8-)).

thanks,
--
js
suse labs