Re: KCSAN: data-race in echo_char / n_tty_receive_buf_common

From: Marco Elver
Date: Mon Nov 04 2019 - 08:26:22 EST


On Mon, 4 Nov 2019 at 13:08, Jiri Slaby <jslaby@xxxxxxxx> wrote:
>
> On 04. 11. 19, 12:44, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: 05f22368 x86, kcsan: Enable KCSAN for x86
> > git tree: https://github.com/google/ktsan.git kcsan
> > console output: https://syzkaller.appspot.com/x/log.txt?x=106bdb60e00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=87d111955f40591f
> > dashboard link:
> > https://syzkaller.appspot.com/bug?extid=e518b0df8f4e19495d3e
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+e518b0df8f4e19495d3e@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > ==================================================================
> > BUG: KCSAN: data-race in echo_char / n_tty_receive_buf_common
> >
> > write to 0xffffc9000c433018 of 8 bytes by task 18008 on cpu 0:
> > add_echo_byte drivers/tty/n_tty.c:845 [inline]
>
> IMO it's expected add_echo_byte is called without output_lock. That's
> exactly what the barrier there is for. See:

Agree it should remain lock-free. It's still a data race because we
have 2 racing concurrent *plain* accesses that are subject to compiler
optimizations such as load/store tearing etc (avoided with
READ_ONCE/WRITE_ONCE).

This data race also highlights the increment, although preceded by a
barrier may still be problematic. I think the increment actually needs
to be atomic (e.g. atomic_inc). If the compiler or the arch turns the
ldata->echo_data++ into a non-atomic increment (something like
"ldata->echo_data = ldata->echo_data + 1"), then concurrent
add_echo_byte can overwrite and loose bytes.

> commit ebec3f8f5271139df618ebdf8427e24ba102ba94
> Author: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Date: Sat May 26 09:53:14 2018 +0900
>
> n_tty: Access echo_* variables carefully.
>
> > echo_char+0x14e/0x1c0 drivers/tty/n_tty.c:948
> > n_tty_receive_char_special+0xb5f/0x1c10 drivers/tty/n_tty.c:1339
> > n_tty_receive_buf_fast drivers/tty/n_tty.c:1610 [inline]
> > __receive_buf drivers/tty/n_tty.c:1644 [inline]
> > n_tty_receive_buf_common+0x1844/0x1b00 drivers/tty/n_tty.c:1742
> > n_tty_receive_buf+0x3a/0x50 drivers/tty/n_tty.c:1771
> > tiocsti drivers/tty/tty_io.c:2197 [inline]
> > tty_ioctl+0xb75/0xe10 drivers/tty/tty_io.c:2573
> > vfs_ioctl fs/ioctl.c:46 [inline]
> > file_ioctl fs/ioctl.c:509 [inline]
> > do_vfs_ioctl+0x991/0xc60 fs/ioctl.c:696
> > ksys_ioctl+0xbd/0xe0 fs/ioctl.c:713
> > __do_sys_ioctl fs/ioctl.c:720 [inline]
> > __se_sys_ioctl fs/ioctl.c:718 [inline]
> > __x64_sys_ioctl+0x4c/0x60 fs/ioctl.c:718
> > do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > read to 0xffffc9000c433018 of 8 bytes by task 7 on cpu 1:
> > flush_echoes drivers/tty/n_tty.c:828 [inline]
> > __receive_buf drivers/tty/n_tty.c:1648 [inline]
> > n_tty_receive_buf_common+0xe3f/0x1b00 drivers/tty/n_tty.c:1742
> > n_tty_receive_buf2+0x3d/0x60 drivers/tty/n_tty.c:1777
> > tty_ldisc_receive_buf+0x71/0xf0 drivers/tty/tty_buffer.c:461
> > tty_port_default_receive_buf+0x87/0xd0 drivers/tty/tty_port.c:38
> > receive_buf drivers/tty/tty_buffer.c:481 [inline]
> > flush_to_ldisc+0x1d5/0x260 drivers/tty/tty_buffer.c:533
> > process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
> > worker_thread+0xa0/0x800 kernel/workqueue.c:2415
> > kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
> > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.4.0-rc3+ #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Workqueue: events_unbound flush_to_ldisc
> > ==================================================================
>
> thanks,
> --
> js
> suse labs